SECTION I


INTRODUCTION 


There are no standard, objective, quantitative means of evaluating currently
available microcomputer-based hazard response models. A number of such models
have recently been proposed, and many of them include up-to-date algorithms for
important scientific phenomena such as evaporative emissions, dense-gas
slumping, and transition to nonbuoyant dispersion. A few data sets exist for
testing these models, but the models have not been tested against or compared
with the data on the basis of standard statistical significance tests. A review
of current vapor cloud models, field data sets, and some examples of hazard
response model evaluations is given by Hanna and Drivas (Reference 1).

The U.S. Air Force, among others, has increased emphasis on calculating
"toxic corridors" due to potential release of hazardous chemicals. The Ocean
Breeze/Dry Gulch (OB/DG) model was originally used for calculating these
corridors, and contains an estimate of model uncertainty. However, the OB/DG
model does not account for dense gas slumping or transient releases. Kunkel
(Reference 2) has developed an improved model (AFTOX) that accounts for many
of these phenomena (but not dense gases). The new generation of models is more
advanced scientifically than the OB/DG model, but the new models do not
account for uncertainty. The Phase I research described below leads to a
preliminary quantitative means of assessing this uncertainty and evaluating
these models.

The  Phase  I  research  is  intended  to  determine  the  feasibility  of  the 
research  program,  which  then  may  be  carried  out  in  a  comprehensive  fashion  in 
Phase  II.  In  this  case,  the  Phase  I  research  has  had  the  objectives  of 
reviewing  the  literature  on  hazardous  response  modeling  uncertainty, 
developing  a  framework  that  accounts  for  the  three  components  of  the 
uncertainty  (model  physics  errors,  data  input  errors,  and  stochastic 
fluctuations),  and  applying  the  preliminary  procedures  to  several  models  using 
data  sets  such  as  the  Prairie  Grass  data,  the  Ocean  Breeze/Dry  Gulch  data,  the 
Green  Glow  data,  and  the  Thorney  Island  data.  The  research  has  attempted  to 
answer  the  following  questions: 


o  Do suitable data sets exist for use in evaluating hazardous
response models?

o  What are the errors in the data used for input to models?

o  Is  it  possible  to  obtain  a  number  of  current  models  for 
evaluation  purposes? 

o  Can  a  model  evaluation  framework  be  developed  that  accounts  for 
all  the  components  of  model  uncertainty,  including  stochastic 
fluctuations? 

o  Can  the  models  properly  handle  the  effects  of  sampling  and 
averaging  times  and  distances  of  concentration  measurements? 

o  What  are  the  confidence  bounds  on  model  evaluation  statistics 
such  as  the  mean  square  error?  Are  they  small  enough  to  permit 
the  relative  performance  of  two  or  more  models  to  be  distinguished? 

The  preliminary  study  conducted  as  Phase  I  of  this  research  plan  has 
resulted  in  a  set  of  conclusions  and  recommendations  concerning  the 
applicability  of  the  methods  developed  to  quantify  the  uncertainty  in 
hazardous  response  modeling.  To  arrive  at  this  result,  the  following  work 
tasks  have  been  completed. 

Task  1:  Literature  Review;  Collection  of  Models  and  Data  Sets. 

Task  2:  Study  Components  of  Model  Uncertainty. 

Task  3:  Develop  Framework  of  Model  Evaluation  Procedure. 

Task  4:  Perform  Preliminary  Application  of  Procedure. 

The  results  of  these  tasks  are  given  in  Sections  II  through  V,  and 
conclusions  and  recommendations  are  given  in  Section  VI. 
SECTION II


LITERATURE  REVIEW 


The first task in any research program is a review of the literature, so
that all relevant work by other researchers is considered. This review has
included the collection of literature describing appropriate hazardous
response models and data sets. In the case of the Air Force Toxic (AFTOX)
model system (Reference 2), the model and the test data sets were obtained
directly from the author. Other reviews include those by Kunkel (References 3
and 4), Carney (Reference 5), Ermak and Merry (Reference 6), and Hanna and
Drivas (Reference 1). In addition, Spicer and Havens (Reference 7) have
evaluated three models with USAF/N2O4 test data for the Air Force Engineering
and Service Center (AFESC), Key and Bowman (Reference 8) have developed and
tested the HARM model for hazardous response modeling, and other DOD groups
have been developing similar models (for example, the HAZZARD model at Dugway
Proving Ground and the D2PC model (Reference 9) at Aberdeen Proving Ground).
Table 1 summarizes some additional hazard response model evaluation exercises
found in the literature. More detailed reviews of these references and of
available models and data sets are provided below.

A.  REVIEW  OF  SPECIFIC  USAF  HAZARD  MODELING  EXPERIENCE 

Because  the  U. S.  Air  Force  handles  many  toxic  chemicals  it  needs  to 
estimate  the  atmospheric  impact  of  releases  of  such  chemicals.  The  25  years 
of  USAF  research  on  this  topic  are  reviewed  below. 

1.  OB/DG  Model  Developments 

The  Ocean  Breeze/Dry  Gulch  (OB/DG)  model  (Reference  27)  was 
developed  for  use  in  support  of  rocket  fuel  handling  operations  at  Cape 
Canaveral  and  Vandenberg.  Dispersion  data  were  collected  at  those  two  sites 
(Cape Canaveral, Florida = Ocean Breeze; Vandenberg AFB, California = Dry
Gulch) and at the Prairie Grass, Nebraska, site during the 1950s and 1960s
(References  28  and  29).  These  data  were  used  to  develop  a  purely  empirical 
correlation  known  as  the  OB/DG  model: 




TABLE  1.  SOME  EXAMPLES  OF  HAZARDOUS  GAS  DISPERSION  MODEL  EVALUATIONS 


Authors 


Models 


Data  Sets 


Layland et al.
(Reference  10) 

Paine  et  al. 

(Reference  11) 
Heinold et al.
(Reference  12) 

Ermak  et  al. 

(Reference  13) 
Ermak  and  Chan 
(Reference  14) 

McRae  (Reference  15) 

Alp  et  al. 

(Reference  16) 


INPUFF  2.0,  DEGADIS, 
OME,  PUFF 

AIRTOX 

Gaussian,  SLAB 
FEM3 

OB/DG,  Gaussian 
COBRA,  HEGADAS 


Eagle (N2O4), Thorney Island
(Freon)

Frenchman Flat (NH3),

Thorney  Island  (Freon), 

Burro  (LNG),  Coyote  (LNG) 


Burro (LNG), Eagle (N2O4)


Eagle  (N204) 

Maplin Sands (LNG)


Riou and Saab                 Box, MERCURE-GL             Thorney Island (Freon)
(Reference 17)

Balentine and Eltgroth        CHARM                       Burro (LNG), Eagle (N2O4)
(Reference 18)

Lewellen et al.               MESO models, Gaussian,      INEL SF6 data
(Reference 19)                ADPIC, IMPACT, others

Wheatley et al.               Picknett, DENZ              Thorney Island (Freon)
(References 20 and 21)

Puttock and Colenbrander      HEGADAS                     Maplin Sands (LNG),
(References 22 and 23)                                    Thorney Island (Freon)

Fay and Ranck                 Their own model             Porton (Freon), van Ulden
(Reference 24)                                            data, Thorney Island (Freon)

Havens and Spicer             DEGADIS                     Burro (LNG), Maplin Sands
(Reference 25)                                            (LNG), Thorney Island
                                                          (Freon), Welker (LPG)

Spicer and Havens             DEGADIS, OB/DG,             Eagle (N2O4)
(Reference 26)                Gaussian
$$C_p/Q = 0.00211\, x^{-1.96}\, \sigma_\theta^{-0.506}\, (\Delta T + 10)^{4.33} \qquad (1)$$

or

$$C_p/Q = 0.000175\, x^{-1.95}\, (\Delta T + 10)^{4.92} \qquad (2)$$

where the ratio of the concentration to the source strength, $C_p/Q$, is in
s m$^{-3}$, the downwind distance $x$ is in m, the standard deviation of wind
direction fluctuations $\sigma_\theta$ is in deg, and $\Delta T$ is defined as
the temperature difference (°F) between the 54 ft. and 6 ft. levels on a tower.
Wind speed is absent because it is strongly correlated with $\Delta T$.
Equation (2) accounts for the strong correlation between $\sigma_\theta$ and
$\Delta T$. Stabilities ranged from neutral to unstable during most of these
tests.
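
As a hedged illustration, Equations (1) and (2) can be evaluated directly. The
sketch below is a minimal Python transcription that assumes the units stated
above; the function name obdg_cp_over_q and its argument names are
illustrative and are not taken from any OB/DG documentation.

```python
def obdg_cp_over_q(x_m, delta_t_f, sigma_theta_deg=None):
    """Return Cp/Q (s m^-3) from the empirical OB/DG correlations.

    x_m             downwind distance (m)
    delta_t_f       temperature difference (deg F), 54 ft level minus 6 ft level
    sigma_theta_deg standard deviation of wind direction (deg); if omitted,
                    Equation (2) is used instead of Equation (1)
    """
    if sigma_theta_deg is None:
        # Equation (2): sigma_theta is folded into the delta-T dependence
        return 0.000175 * x_m ** -1.95 * (delta_t_f + 10.0) ** 4.92
    # Equation (1): explicit sigma_theta dependence
    return (0.00211 * x_m ** -1.96 * sigma_theta_deg ** -0.506
            * (delta_t_f + 10.0) ** 4.33)


# Illustrative inputs only (not taken from the field data)
print(obdg_cp_over_q(1000.0, -2.0))        # Equation (2)
print(obdg_cp_over_q(1000.0, -2.0, 10.0))  # Equation (1)
```

Both correlations are pure power laws, so doubling the distance lowers the
predicted Cp/Q by a fixed factor (2 to the power 1.95 or 1.96), whatever the
stability.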

Predictions of Equation (1) are compared with 321 observations
in Figure 1, showing that 72 percent of the predictions are within a factor of
two of the observations, and 97 percent are within a factor of four (Reference
27). The OB/DG model was derived using a special subset of data taken from
the Ocean Breeze and Dry Gulch experiments. Another subset of data from the
same experiments is used in Figure 1. This information on model variability
is used by the OB/DG equation to build uncertainty into the model. For
example, the model predicts the "toxic corridor length," or distance from the
source until a certain concentration is reached. Based on the variability
discussed above, the "95th percentile" toxic corridor length is about two
times the "50th percentile" or median length. The "95th percentile" length is
the length such that 95 percent of observations would be expected to be less
than that value for a given set of input parameters.
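
The two ideas in this paragraph can be sketched in a few lines of Python: the
fraction of predictions within a given factor of the observations, and a toxic
corridor length obtained by inverting Equation (2). This is a minimal sketch
under stated assumptions: the function names are hypothetical, the inversion
uses only Equation (2), and the factor of two applied to the median length is
the approximate value quoted in the text, not a derived result.

```python
def fraction_within_factor(predicted, observed, factor=2.0):
    """Fraction of prediction/observation pairs whose ratio lies within
    [1/factor, factor]."""
    hits = sum(1 for p, o in zip(predicted, observed)
               if 1.0 / factor <= p / o <= factor)
    return hits / len(observed)


def corridor_length_m(cp_over_q_limit, delta_t_f):
    """Median distance (m) at which Equation (2) falls to cp_over_q_limit
    (s m^-3), for a temperature difference delta_t_f (deg F)."""
    scale = 0.000175 * (delta_t_f + 10.0) ** 4.92
    return (cp_over_q_limit / scale) ** (-1.0 / 1.95)


# Illustrative inputs only
median_m = corridor_length_m(1.0e-6, -2.0)
upper_95_m = 2.0 * median_m  # "about two times" the median, per the text
print(median_m, upper_95_m)
```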

2.  Evaluations of OB/DG Model

The U.S. Air Force Scientific Advisory Board Ad Hoc Committee on
Dispersion of Denser than Air Gases recommended on 9 November 1983 that the
OB/DG model be evaluated and possibly replaced with a current state-of-the-art
dispersion model. The U.S. Air Force supported two specific reviews of the
OB/DG model (References 30 and 31). Ohmstede et al. (Reference 30) compared
the model predictions with observations at several sites and recommended that
the OB/DG model be used only for small releases during the daytime at downwind
distances less than about 5 km. They point out that the model is useful only
for near-surface continuous point source releases of neutrally buoyant gases.

Figure 1.  Observed vs. Predicted Cp/Q: Independent Data Test of Final
Diffusion Prediction Equation (1) (Reference 9).

Kunkel (Reference 31) also found problems with the OB/DG model
at night, since few nighttime data were included in the derivation of the
model. For light-wind stable conditions, the model underpredicts. However,
for windy stable conditions, the model overpredicts by a factor of about four.

3.  TOXCOR  Model 

The  Air  Weather  Service  modified  the  OB/DG  model  for  application 
to  a  wide  variety  of  chemical  sources.  Clewell  (Reference  32)  and  Kahler  et 
al.  (Reference  33)  describe  the  TOXCOR  model,  which  is  basically  the  OB/DG 
model with semiempirical corrections for the molecular weight and vapor
pressure of the chemical. Hydrazine, monomethyl hydrazine, and N2O4 are
included  in  this  model. 

4.  AFTOX  Model  Development 

Following  the  9  November  1983  recommendation  of  the  Scientific 
Advisory  Board  Ad  Hoc  Committee  on  Dispersion  of  Denser  than  Air  Gases,  Bruce 
Kunkel  of  AFGL  began  development  of  a  dispersion  model  based  more  on  the 
current  state  of  the  art.  The  resulting  model,  called  AFTOX,  is  described  in 
a  user’s  guide  prepared  by  Kunkel  (Reference  2),  who  includes  test  cases  and 
comparisons  with  the  OB/DG  model.  This  development  began  with  an  evaluation 
of  existing  spill  evaporation  models  (Reference  3)  and  transport  and 
dispersion  models  (Reference  31).  The  latter  report  considered  the  OB/DG,  the 
Shell  SPILLS  model  and  a  modified  Shell  SPILLS  model.  The  modification 
consists  mainly  of  the  addition  of  an  algorithm  from  Smith  (Reference  34)  that 
permits  continuous  stability  classes  to  be  used.  The  resulting  predictions 
are  much  smoother  functions,  as  seen  in  Figure  2.  This  modified  Shell  model 
evolved  into  the  AFTOX  model,  which  has  a  Gaussian  basis  and  can  be  applied  to 
both continuous and instantaneous sources. Because it does not apply to dense
gas  sources,  its  application  is  intended  for  Air  Force  sites  without  the 
potential  for  large  releases  of  dense  gases. 
Figure 2.  Model Estimates of the Hazard Distance for Benzene as a Function
of Wind Speed for a High (50°) and Low (20°) Sun Elevation
Angle. The source strength is 1 kg/sec and the concentration of
interest is 30 mg/m$^3$. The light solid line represents the OB/DG
model, the heavy solid line represents the Shell Model, and the
dashed line represents the Modified Shell Model. The letters
represent the Pasquill stability category used in the Shell
Model (Reference 13).
The report by Kunkel (Reference 2) compares the AFTOX model with
the  OB/DG  model  for  the  Ocean  Breeze,  Dry  Gulch,  Prairie  Grass,  and  Green  Glow 
field  studies  (these  same  data  are  evaluated  later  in  Section  V).  The  AFTOX 
model  is  shown  to  fit  the  Ocean  Breeze,  Dry  Gulch  and  Prairie  Grass  data  as 
well  as  the  OB/DG  model  (which  was  derived  from  these  same  data),  and  performs 
better  at  the  Green  Glow  site.  The  figures  presented  in  Reference  2  show  the 
typical  AFTOX  model  error.  For  example,  Figure  3  shows  that,  at  any  given 
observed  C/Q,  there  is  a  range  of  about  an  order  of  magnitude  in  the  predicted 
C/Q. 


5.  Models for Sites Where Liquid Propellant May Be Spilled


The OB/DG model was of only limited value at many sites where
large amounts of liquid propellant were stored. A few accidental spills
occurred, and the resulting plumes were observed to have a variety of
behaviors, ranging from dense gas slumping to cases where the cloud rose
several hundred meters into the atmosphere. This behavior depended on the
initial source conditions. The AFESC supported an adaptation of the CHARM
model to cases where dense gas slumping may occur. RADIAN (Reference 35)
describes how the CHARM model was updated to include N2O4 and Aerozine-50, and
Balentine and Eltgroth (Reference 18) evaluate the revised CHARM model with
N2O4 and LNG field experiments. They compare the CHARM predictions with the
predictions of three Gaussian models, finding that the Gaussian model
$\sigma_z$'s are a factor of ten times the CHARM $\sigma_z$'s, and that all
Gaussian model concentration predictions are below the observations.


Independently, Key and Bowman (Reference 8) reported on the
Hypergolic Accidental Release Model (HARM) for application to accidental
spills at Titan II sites. A rocket exhaust diffusion model developed by the
H. E. Cramer Co. was used as a basis for HARM, which includes special
provisions for hypergolic reactions and the resulting rise of a buoyant puff
into the atmosphere:
Figure 3.  Predictions of the AFTOX Model Plotted versus Observations
(from Reference 2).


where  $\gamma = 0.4$  (entrainment constant)

$s = (g/T)(d\theta/dz)$  (stability parameter)

$F_I = 3gH/(4\pi c_p \rho T)$  (heat flux)

$r_o$ = initial radius

The parameter $g$ is the acceleration of gravity (9.8 m/s$^2$), $T$ is ambient
temperature (°K), $\theta$ is potential temperature (°K), $z$ is height (m),
$c_p$ is the specific heat of air at constant pressure (J gm$^{-1}$ °K$^{-1}$),
and $\rho$ is air density (gm m$^{-3}$). The parameter $H$ is the instantaneous
heat release in joules. The instantaneous buoyancy flux, $F_I$, is produced by
the release of heat occurring during the reaction of the rocket propellants.
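
The two auxiliary quantities defined above can be computed as follows. This is
a minimal sketch covering only the stability parameter and the instantaneous
buoyancy flux; it does not implement the HARM puff-rise formula. The function
names and the default values of c_p and air density are assumptions chosen as
typical near-surface values, not values taken from Reference 8.

```python
import math

G = 9.8  # acceleration of gravity (m s^-2), as given in the text


def stability_parameter(temp_k, dtheta_dz):
    """s = (g/T)(d theta/dz), in s^-2, for T in K and d theta/dz in K m^-1."""
    return (G / temp_k) * dtheta_dz


def instantaneous_buoyancy_flux(heat_j, temp_k, cp=1.005, rho=1200.0):
    """F_I = 3 g H / (4 pi c_p rho T), in m^4 s^-2.

    heat_j  instantaneous heat release H (J)
    cp      specific heat of air (J g^-1 K^-1); assumed typical value
    rho     air density (g m^-3); assumed typical value
    """
    return 3.0 * G * heat_j / (4.0 * math.pi * cp * rho * temp_k)


# Illustrative inputs only
print(stability_parameter(300.0, 0.01))
print(instantaneous_buoyancy_flux(1.0e9, 300.0))
```

With H in joules, c_p in J per gram per kelvin, and density in grams per cubic
meter, the mass and temperature units cancel and F_I comes out in m^4 s^-2.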

6.  N2O4 Field Tests

Because little information was available on the dispersion of
the liquid propellant in the atmosphere, the AFESC sponsored a series of
six field tests at the DOE facility near Las Vegas. Known as the Eagle
experiments, the six field tests involved spills of three to five m$^3$ of
N2O4. McRae (Reference 15) describes the field data and presents the results of
evaluations of the OB/DG, Shell, CHARM, and Gaussian models. He suggests that
a major problem is the correct estimation of the evaporative source term,
since the heat balance of the spill depends on the evaporative cooling term.

Of the models tested, only the CHARM model could properly
simulate the dense gas slumping. Because the other three models do not have
dense gas components, they overestimate $\sigma_z$ by an order of magnitude and
underpredict the concentrations by a factor of two to ten. However, the CHARM
model is found to "switch over" too soon to a neutral buoyancy algorithm, for
the observed cloud exhibited slumping to relatively large distances. McRae
(Reference 15) points out that it is difficult to compare predictions with
observations, since the required averaging times are not usually matched.

7.  DEGADIS  Modification  for 

The DEGADIS model was developed by the U.S. Coast Guard for
application to simulating dense gas sources such as LNG and LPG. The model
has no source algorithm and treats only surface-level releases. The U.S. Air
Force sponsored an evaluation of the DEGADIS model with their Eagle field
data (Reference 26). The DEGADIS model, a Gaussian model, and the OB/DG model
were compared with the field data, with the results shown in Table 2. The
DEGADIS concentration predictions are clearly in the range of the
observations, while the other model predictions are an order of magnitude low.
However, the source term had to be adjusted to account for the interaction of
N2O4 with moisture in the air.

Further  comparisons  of  the  DEGADIS  model  with  dense  gas  data  from 
Thorney Island, Maplin Sands, Burro, and Welker experiments are reported in
Reference  7.  The  model  predicts  the  average  concentration  well,  but  sometimes 
underpredicts  the  absolute  maximum  concentration  by  a  factor  of  two  to  five. 

The AFESC then supported an extension of DEGADIS to aerosol releases
(i.e., two-phase jets). The revised model was tested with anhydrous ammonia
data (the so-called Desert Tortoise experiments) by Spicer, Havens and Key
(Reference 36). Figure 4 shows that the DEGADIS predictions are within a
factor of two of the data points at distances of 500 and 3000 meters, but are
much too high at 100 meters. Because there are only two data points, the
significance of these conclusions is very low.

8.  Model  Sensitivity  Studies 

During 1986 and 1987, Professor Carney of Florida State
University prepared several papers for the AFESC on the sensitivity of the
AFTOX, CHARM, and PUFF models to uncertainties in input data (Reference 5).
His 1987 paper applied the uncertainty formula suggested by Freeman et al.
(Reference 37), which has also been applied by Hanna (Reference 38) to a
simplified air quality model. If concentration, $C$, is an analytical function
of the variables $x_i$ ($i = 1$ to $n$), then the uncertainty or variance
$V_C = \sigma_C^2$ is given by the equation
TABLE 2.  COMPARISON OF EAGLE 3 AND EAGLE 6 TEST RESULTS AND GAS DISPERSION
MODEL PREDICTIONS (REFERENCE 6).

                    Maximum NO2
                    concentration      σy               σz
                    range (ppm)*       (m)              (m)

EAGLE 3
  Test Results      500-1040           35               3.8
  OB/DG             68-73              --               --
  Gaussian Plume    68-73              60.5             31.9
  DEGADIS           880-1170           57.6-60.7**      2.3-2.9***

EAGLE 6
  Test Results      160-340            35               7.6
  OB/DG             20-21              --               --
  Gaussian Plume    25-27              60.5             31.9
  DEGADIS           190-220            55.6-56.6**      4.5-4.9***

*The concentration ranges for the model predictions are for the estimated
source evolution rate range.

**Calculated as $S_y$

***Calculated as $S_z/\sqrt{5}$
Figure 4.  Maximum Observed Concentration and Maximum Predicted
Concentrations Using DEGADIS and the Pasquill-Hanna Gaussian
Plume Model for Desert Tortoise 4 (Reference 20).
where $\sigma_{x_i}^2$ is the uncertainty or variance in input variable $x_i$.
This equation is a Taylor expansion and implicitly assumes that the individual
uncertainties are much less than one. Carney (Reference 5) finds that the wind
speed, $u$, contributes the most uncertainty to the concentration, $C$,
predicted by the AFTOX model.
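
A Taylor-expansion estimate of this kind can be sketched numerically. The
example below is a minimal sketch that assumes the first-order form
(the sum over inputs of the squared partial derivative times the input
variance) with independent input errors; the formula of Reference 37 may carry
additional terms. The simple concentration function is a hypothetical
stand-in, not the AFTOX model.

```python
import math


def first_order_variance(func, x, variances, rel_step=1e-6):
    """First-order Taylor estimate of the variance of func(*x).

    x          nominal input values
    variances  variance of each input (same order as x), assumed independent
    Partial derivatives are taken by central finite differences.
    """
    var_c = 0.0
    for i, (xi, vi) in enumerate(zip(x, variances)):
        h = abs(xi) * rel_step or rel_step  # fall back if xi is zero
        up = list(x)
        dn = list(x)
        up[i] = xi + h
        dn[i] = xi - h
        dcdx = (func(*up) - func(*dn)) / (2.0 * h)
        var_c += dcdx ** 2 * vi
    return var_c


def simple_concentration(q, u, sigma_y, sigma_z):
    """Hypothetical stand-in: ground-level centerline Gaussian plume form."""
    return q / (math.pi * u * sigma_y * sigma_z)


# Illustrative case: 10 percent standard deviation in wind speed only
x0 = [1.0, 5.0, 50.0, 20.0]
var_c = first_order_variance(simple_concentration, x0, [0.0, 0.25, 0.0, 0.0])
c0 = simple_concentration(*x0)
print(math.sqrt(var_c) / c0)  # relative uncertainty in C
```

Because this stand-in varies as 1/u, a small relative uncertainty in wind speed
maps to nearly the same relative uncertainty in concentration, which
illustrates why wind speed can dominate the total.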


9.  New  Air  Force  Model  Development:  Raj  et  al.  (Reference  39)  and 
Ermak  et  al.  (Reference  40) 

Two  major  new  model  development  efforts  are  sponsored  by  the  Air 
Force.  The  Air  Force  Geophysics  Laboratory  (AFGL)  and  AFESC  are  supporting 
the development of an improved dense gas model (ADAM: Air Force Dispersion
Assessment Model) that accounts for aerosols and jets (Reference 39) and can
be installed at bases where more serious hazardous spills may occur. In
addition, AFESC is supporting a modification of the SLAB model (Reference 40)
to make it more user-friendly and to allow it to handle transient releases.
These  new  models  are  expected  to  be  delivered  in  1988. 

10.  Summary  of  Field  Data 

Ermak et al. (Reference 40) have put together a comprehensive
summary of 26 "bench mark" field experiments, including data from Burro (LNG),
Coyote (LNG), Eagle (N2O4), Desert Tortoise (NH3), Maplin Sands (LNG and LPG),
and Thorney Island (Freon). This study (funded by AFESC) presents input data
required by models and includes observed peak concentrations, average
centerline concentrations, and average height and width of the cloud as a
function of downwind distance. Presumably these data are sufficiently
complete for anyone to run and evaluate a model.

11.  Heavy  Vapor  Model  Comparisons 

Mercer (Reference 41) describes a model comparison exercise
underway in Europe. Several organizations are running their heavy gas models
(e.g., DENZ, DEGADIS, DRIFT) on the same data set. Table 3 contains the
matrix of input data (25 different cases). Models will be run with a given
source emission rate. Unfortunately, there are no baseline data with which to
compare the model predictions.
TABLE 3.  DATA FOR MERCER (REFERENCE 41) MODEL COMPARISON EXERCISE

MATRIX OF RUNS

Wind Speed at 10 m:   1   2   4   8
Stability:            D   F   D   D   D

Volume         Radius     Roughness
(m^3)          (m)        length (m)     Runs

                          0.01           X   X   X   X   X
2000           7          0.3            X   X   X   X   X
               24         0.01           X   X   X   X   X
                          0.05           X   X   X   X   X
2.5 x 10^5     120        1.5            X   X   X   X   X

Total 25 Cases


It is planned that ADAM (Reference 39) be included in this model
comparison.

12.  A  Methodology  for  Evaluating  Heavy  Gas  Dispersion  Models 

In  a  recent  draft  report  prepared  for  AFESC,  Ermak  and  Merry 
(Reference  6)  review  methods  for  evaluating  heavy  gas  dispersion  models.  They 
first  list  several  specific  criteria  of  interest  to  the  Air  Force: 

o  The methodology is to be based on comparison of model
predictions with field-scale experimental observations.

o  The methods of comparison must be quantitative and
statistical in nature.

o  The methods must help identify limitations of the models
and levels of confidence.

o  The methodology must be compatible with atmospheric
dispersion models of interest to the Air Force.
Because these criteria are similar to those stated for our present study, the
results  presented  by  Ermak  and  Merry  are  of  great  use  to  our  research.  Their 
report  was  received  after  our  Phase  I  work  was  completed,  and  only  a  brief 
review  was  possible.  A  more  extensive  study  of  their  results  will  take  place 
in  Phase  II. 

The  Ermak  and  Merry  (Reference  6)  report  is  a  review  of  general 
evaluation  methods  and  heavy  gas  model  data  sets,  and  does  not  contain 
examples  of  applications  of  any  new  evaluation  methods  with  field  data  sets. 
Presumably  these  applications  will  take  place  in  a  later  phase  of  their  work. 
They  first  review  the  general  philosophy  of  model  evaluation,  pointing  out 
that  sometimes  evaluations  of  model  physics  are  just  as  important  as 
quantitative statistical evaluations. Much of their philosophical discussion
follows the points made in a review paper by Venkatram (Reference 42). For
example,  a  model  whose  predictions  agree  with  field  data  but  which  contains  an 
irrational  physical  assumption  (e.g.  dense  gas  plumes  accelerate  upward)  is 
not  a  good  model.  Also,  they  recognize  that  most  model  predictions  represent 
ensemble  averages,  whereas  field  experiments  represent  only  a  single 
realization  of  the  countless  data  that  make  up  an  ensemble.  They  emphasize 
that  observed  concentrations  are  strong  functions  of  averaging  time,  and  that 
most heavy gas dispersion models do not include the effects of averaging time.

Heavy  gas  dispersion  models  are  distinguished  from  other  dispersion 
models  by  three  effects:  reduced  turbulent  mixing,  gravity  spreading,  and 
lingering.  The  main  parameters  of  interest  in  evaluations  of  these  models 
are  the  maximum  concentration,  the  average  concentration  over  the  cloud,  and 
the  cloud  width  and  height  (all  as  a  function  of  downwind  distance,  x). 

Ermak  and  Merry  emphasize  the  ratio  of  predicted  to  observed  variables  and 
define  several  statistics,  such  as  the  mean  and  the  variance.  Methods  of 
estimating  confidence  limits  on  these  statistics  are  suggested,  and  the  report 
closes  with  an  example  of  the  application  of  some  of  their  suggested 
procedures to a concocted data set drawn from a Gaussian distribution.
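
Statistics of this kind can be sketched in Python as follows. The example
computes the mean and variance of the predicted-to-observed ratio and a
confidence interval on the mean. It is a minimal sketch under stated
assumptions: the function names are hypothetical, and the bootstrap
(resampling with replacement) is one common way of estimating confidence
limits, not necessarily the method suggested in Reference 6.

```python
import random
import statistics


def ratio_statistics(predicted, observed):
    """Mean and sample variance of the predicted/observed ratio."""
    ratios = [p / o for p, o in zip(predicted, observed)]
    return statistics.mean(ratios), statistics.variance(ratios)


def bootstrap_mean_interval(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval on the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(statistics.mean(rng.choices(values, k=n))
                   for _ in range(n_resamples))
    lower = means[int((alpha / 2.0) * n_resamples)]
    upper = means[int((1.0 - alpha / 2.0) * n_resamples) - 1]
    return lower, upper


# Concocted ratios drawn from a Gaussian distribution (illustration only)
rng = random.Random(1)
ratios = [rng.gauss(1.0, 0.2) for _ in range(200)]
print(statistics.mean(ratios), statistics.variance(ratios))
print(bootstrap_mean_interval(ratios))
```

If the interval on the mean ratio excludes one, the model shows a bias that
sampling fluctuation alone is unlikely to explain at the chosen confidence
level.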
B.  REVIEW OF GENERAL HAZARD MODEL EVALUATION/UNCERTAINTY STUDIES


Section II.A covered specific U.S. Air Force hazard model development and
evaluation experience. The present section is intended to cover more general
(that  is,  non-USAF)  hazard  model  evaluation/uncertainty  studies.  Hanna  and 
Drivas  (Reference  1)  review  several  such  studies,  and  Table  1  (presented 
earlier)  contains  a  list  of  13  references  on  this  subject.  Most  papers  and 
reports  on  hazard  model  evaluation  deal  only  with  one  particular  model  or  data 
set.  Exceptions  are  papers  by  Mercer  (Reference  43),  McNaughton  et  al. 
(References  44  and  45)  and  Layland  et  al.  (Reference  10),  which  are  reviewed 
first. 

1.  Comprehensive  Model  Evaluation  Studies 

Mercer’s  (Reference  43)  review  emphasizes  estimation  of  variability 
or  uncertainty  in  model  predictions,  which  he  finds  is  typically  an  order  of 
magnitude  when  outliers  are  considered.  He  includes  the  following  quote  from 
Lamb  (Reference  46),  which  is  also  appropriate  for  our  discussion. 

"The  predictions  even  of  a  perfect  model  cannot  be  expected  to  agree  with 
observations  at  all  locations.  Consequently,  the  common  goal  of  model 
validation  should  be  one  of  determining  whether  observed  concentrations  fall 
within  the  interval  indicated  by  the  model  with  the  frequency  indicated,  and 
if  not,  whether  the  failure  is  attributable  to  sampling  fluctuations  or  is  due 
to  the  failure  of  the  hypotheses  on  which  the  model  is  based.  From  the 
standpoint of regulatory needs the utility of a model is measured partly by
the width of the interval in which a majority of observations can be expected
to fall. If the width of the interval is very large, the model may provide no
more  information  than  one  could  gather  simply  by  guessing  the  expected 
concentration.  In  particular,  when  the  width  of  the  interval  of  probable 
concentration  values  exceeds  the  allowable  error  bounds  on  the  model’s 
predictions,  the  model  is  of  no  value  in  that  particular  application." 
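
The test described in this quotation can be sketched as a coverage check: count
how often the observations fall inside the interval given by the model, and
compare that frequency with the nominal one. The sketch below is illustrative
only; the function name is hypothetical, and the binomial standard error used
to judge sampling fluctuation assumes independent observations.

```python
import math


def coverage_check(observed, lower, upper, nominal):
    """Return (fraction of observations inside [lower, upper], z-score).

    nominal is the frequency with which the model says observations should
    fall inside its interval. The z-score measures the departure from nominal
    in units of the binomial standard error.
    """
    n = len(observed)
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper)
                 if lo <= o <= hi)
    fraction = inside / n
    std_err = math.sqrt(nominal * (1.0 - nominal) / n)
    return fraction, (fraction - nominal) / std_err


# Illustrative numbers only
print(coverage_check([1.0, 2.0, 3.0, 4.0],
                     [0.0, 0.0, 0.0, 5.0],
                     [2.0, 3.0, 4.0, 6.0],
                     nominal=0.5))
```

A z-score of small magnitude suggests the mismatch could be sampling
fluctuation; a large one points toward a failure of the model hypotheses.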

Mercer  (Reference  43)  then  produces  concentration  predictions  of  ten 
different  models  for  a  dense  gas  source  equivalent  to  that  used  in  the  Thorney 
Island  experiments.  This  comparison  (Figure  5)  shows  that  the  10  model 
predictions  range  over  an  order  of  magnitude  at  any  given  downwind  distance. 
Figure 5.  Model Predictions for Thorney Island Trials (Reference 43)


The Chemical Manufacturers' Association (CMA) sponsored an evaluation
of eight dense gas dispersion models and nine spill evaporation models
(References 44 and 45). The authors ran some of the models themselves and
requested the developers of proprietary models to run their own models using
standard input data sets. Model uncertainty is typically a factor of two to
five. The comparisons are clouded by the use of some data sets that had
already been used to "tune" certain of the models tested.

Some of the same authors were involved in a similar study performed
for the EPA (Reference 10), in which the DEGADIS, OME, and INPUFF models were
compared  with  some  Thorney  Island  observations.  The  DEGADIS  and  OME  models 
account  for  dense  gas  slumping.  Figure  6  contains  the  results  of  these 
comparisons,  which  do  not  give  one  great  confidence  in  hazard  model 
predictions.  Some  of  this  error  may  be  caused  by  uncertainties  in  input  data 
specification. 

2.  Model  Comparisons  with  Thorney  Island  Data 

The  Thorney  Island  experiment  was  carried  out  to  test  the  dense  gas
slumping  component  of  hazard  response  models.  The  experiment  is  reviewed  in
more  detail  in  Section  II.D,  but  can  be  briefly  described  as  an  instantaneous
ground-level  release  over  a  flat  surface  of  about  1000  cubic  meters  of  freon
gas  in  the  shape  of  a  cylinder.  The  data  were  made  available  quickly  and
completely  to
the  modeling  community  and  have  been  used  for  many  model  comparisons,  some  of 
which  are  discussed  below. 

Puttock  and  Colenbrander  (Reference  23)  discuss  the  random  nature  of
the  data.  They  state  "it  is  desirable  to  know  exactly  what  any  model  is
intended  to  predict  (usually  some  sort  of  average),  and,  in  a  final
assessment,  how  much  one  experiment  might  be  expected  to  deviate  from  the
average."  This  quote  emphasizes  the  fact  that  a  model  prediction  represents
an  ensemble  average,  which  would  be  given  in  the  atmosphere  by  an  average  over
100  or  more  experiments  conducted  with  the  same  external  conditions.  They
present  Figure  7,  which  demonstrates  the  uncertainty  in  observed
concentrations  in  a  single  Thorney  Island  experiment.  Data  from  three
different  sensor  heights  are  plotted.

Figure  6.  Model  Comparisons  Published  in  Reference  10.
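The  ensemble-average  interpretation  discussed  by  Puttock  and  Colenbrander  can
be  illustrated  with  a  short  calculation:  given  concentrations  from  repeated
experiments  under  the  same  external  conditions,  compute  the  ensemble  mean  and
how  far  a  single  realization  departs  from  it.  The  values  below  are  synthetic
and  the  function  name  is  an  assumption  made  for  this  sketch.

```python
import statistics


def ensemble_summary(realizations):
    """Return (mean, sample standard deviation, worst-case factor from the mean).

    `realizations` holds positive concentrations measured at the same point in
    repeated experiments. The worst-case factor is the largest of c/mean or
    mean/c over all realizations.
    """
    if len(realizations) < 2 or any(c <= 0 for c in realizations):
        raise ValueError("need at least two positive realizations")
    mean = statistics.fmean(realizations)
    stdev = statistics.stdev(realizations)
    worst_factor = max(max(c / mean, mean / c) for c in realizations)
    return mean, stdev, worst_factor


# Synthetic concentrations (percent) from four notional repeat experiments.
mean, stdev, worst_factor = ensemble_summary([2.0, 4.0, 6.0, 8.0])
print(f"ensemble mean = {mean:.1f}, stdev = {stdev:.2f}, "
      f"worst single-experiment factor = {worst_factor:.1f}")
```

Even  a  model  that  predicts  the  ensemble  mean  exactly  would,  in  this  synthetic
case,  differ  from  one  of  the  individual  experiments  by  a  factor  of  2.5.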


Puttock  (Reference  47)  later  compares  the  predictions  of  his
HEGABOX/HEGADAS  model  with  observations  during  13  Thorney  Island  trials.
Table  4  presents  the  observed  and  predicted  distances  where  peak
concentrations  equal  5  percent,  2  1/2  percent,  and  1  percent.  Generally,  the
observations  and  predictions  agree  to  within  ±50  percent.  He  finds  that  the
roughness  length  used  in  the  model  should  be  based  on  upwind  terrain  and
therefore  on  wind  direction.

The  uncertainties  in  models  were  also  emphasized  by  Wheatley  et  al. 
(Reference  48),  who  used  the  Thorney  Island  data  to  derive  the  top  entrainment 
constant  necessary  to  make  models  such  as  DENZ  agree  with  the  observations. 
Figure  8  presents  the  range  of  derived  top  entrainment  constants  for  13
Thorney  Island  trials.  The  symbols  S ^  and  Sg  refer  to  goodness-of-fit
measures  employed  in  the  maximum  likelihood  procedure.  The  range  is  typically 
an  order  of  magnitude. 

The  American  Petroleum  Institute  (API)  supported  a  comparison  by  EAI
(Reference  49)  in  which  the  Eidsvik,  MARIAH  II,  HEGADAS  II,  and  Cox  and
Carpenter  models  were  evaluated  with  the  Thorney  Island  data.  These  four
models  were  chosen  because  they  represent  the  three  model  groups:

Box  models:  Eidsvik;  Cox  and  Carpenter
K-theory  model:  HEGADAS  II
Hydrodynamic  3D  model:  MARIAH  II

This  report  is  very  useful  because  it  contains  complete  sets  of  tables  and 
figures  that  can  be  used  by  other  researchers  for  further  analysis.  It  is 
concluded  that  the  Eidsvik  and  Cox  and  Carpenter  models  are  conservative  in
the  sense  that  they  overpredict  by  a  factor  of  about  two,  while  the  HEGADAS
and  MARIAH  models  are  closer  to  the  observations.  Consequently,  the  first  two
models  are  recommended  for  screening  analyses,  while  the  last  two  models  are
recommended  for  more  refined  calculations.  An  example  of  a  comparison  of  the
four  models  with  data  from  Trial  15  is  presented  in  Figure  9.  The  tendency
of  the  Eidsvik  and  Cox  and  Carpenter  models  to  overpredict  can  be  clearly
seen.

TABLE  4.  THORNEY  ISLAND  OBSERVATIONS  AND  HEGABOX/HEGADAS  MODEL  PREDICTIONS
(REFERENCE  47).

Figure  8.  Entrainment  Rates  Derived  from  the  Thorney  Island  Data
(Reference  48).

Figure  9.  Predictions  of  Four  Models  Compared  with  Thorney  Island  Test
15  Data  (Reference  49).  Panel  title:  Test  15,  Time  vs.  Max  Concentration.

Our  final  example  of  a  model  evaluation  using  the  Thorney  Island  data
is  the  paper  by  Chikhliwala  et  al.  (Reference  50).  The  so-called  SAFER  model,
whose  dense  gas  slumping  algorithm  is  based  on  the  Kaiser  and  Walker  model,  is
compared  with  data  from  six  Thorney  Island  trials.  The  agreement  appears  to
be  good;  however,  this  may  not  be  an  independent  comparison,  since  these  same
data  were  used  in  the  development  of  the  model.

3.  Model  Comparisons  with  Aerosol  Data 

The  Thorney  Island  source  was  highly  simplified  so  that  the  data
could  be  used  to  test  the  dense  gas  algorithms  in  models.  Many  real-world
sources  consist  of  two-phase  jet  releases,  such  as  the  release  of  HF  or
NH3  from  a  hole  in  a  pressurized  tank.  Here  the  major  problem  is  to  estimate
what  fraction  of  the  release  consists  of  gas  and  what  fraction  consists  of
liquid.  The  situation  is  further  complicated  by  the  fact  that  part  of  the
liquid  may  be  spilled  onto  the  ground  (and  subsequently  evaporated)  and  part
may  be  broken  up  into  an  aerosol  that  drifts  off  downwind.  Thus,  a  release  of
NH3,  with  a  molecular  weight  less  than  that  of  air,  can  act  like  a  dense  plume
because  of  the  liquid  drops  within  the  plume.

Blewitt  et  al.  (Reference  51)  discuss  applications  of  the  SLAB  and
DEGADIS  models  to  data  collected  from  a  series  of  HF  tests.  They  found  that
the  source  term  must  be  adjusted  so  that  all  of  the  HF  is  assumed  to  drift  off
with  the  gas  plume  (that  is,  no  liquid  spill  occurs  on  the  surface).  With
this  adjustment,  the  SLAB  model  predictions  were  within  a  factor  of  two  of  the
observations.  Analysis  of  the  DEGADIS  model  predictions  was  deferred  until
questions  regarding  the  appropriate  averaging  time  for  the  model  are  resolved.

Large-scale  spill  tests  of  NH3  and  N2O4  at  the  Nevada  Test  Site  are
discussed  by  Koopman  et  al.  (Reference  52).  These  tests  were  named  the  Desert
Tortoise  and  Eagle  experiments  (referred  to  earlier),  respectively.  Aerosols
were  found  to  play  a  very  important  role  in  dense  gas  dispersion.  A  simple
Gaussian  plume  model,  the  empirical  OB/DG  model,  and  the  three-dimensional
hydrodynamic  model  FEM3  were  applied  to  the  data.  The  Gaussian  and  OB/DG
models  were  inadequate,  while  the  FEM3  model  gave  reasonable  predictions  once
the  source  term  was  adjusted  to  account  for  the  presence  of  aerosols.


C.  REVIEW  OF  AVAILABLE  HAZARD  RESPONSE  MODELS 


This  section  is  divided  into  a  relatively  short  comprehensive  review  of 
available  hazard  response  models  and  a  more  complete  review  of  available  Air 
Force  models. 

1.  Comprehensive  Review  of  Models 

As  part  of  a  review  for  the  AIChE,  Hanna  and  Drivas  (Reference  1)
distributed  questionnaires  to  all  known  hazard  response  modelers.  This  list
included  developers  of  publicly  available  models  as  well  as  proprietary
models.  Because  of  the  absence  of  a  recommended  government  model,  a
flourishing  business  in  proprietary  hazard  response  models  has  grown  up  over
the  past  few  years.  A  total  of  32  completed  questionnaires  were  returned,  and
the  results  are  tabulated  in  Table  5.  Most  of  the  entries  in  the  table  are
"Yes"  or  "No"  answers  to  questions  regarding  whether  or  not  the  model  includes
a  certain  component.  Both  the  source  emission  component  and  the  transport  and
dispersion  component  are  included.  This  table  represents  the  status  of  models
as  of  about  January  1987.  Since  that  time  new  models  have  appeared  and  some
of  the  older  models  have  been  modified.

Many  people  are  working  on  specific  subcomponents  to  models.  For 
example,  papers  on  various  aspects  of  dense  gas  dispersion  appear  routinely  in 
the  scientific  journals,  but  in  most  cases  these  models  do  not  find  their  way
into  Table  5  because  they  are  not  embodied  in  a  specific  user-friendly  piece 
of  software. 

TABLE  5.  RESULTS  FROM  MODEL  QUESTIONNAIRES  (MOST  ANSWERS  ARE  GIVEN  AS  Y  =  YES  OR  N  =  NO)  AS  OF  DECEMBER  1986

2.  Review  of  Air  Force  and  other  Publicly-Available  Models

At  the  start  of  this  project  an  attempt  was  made  to  collect
publicly  available  hazard  response  models  and  to  install  and  test  them  on
microcomputers.  Because  of  the  requirement  that  the  models  be  installed  on
microcomputers,  some  models  such  as  FEM3  and  DEGADIS  were  not  included  (Note:
the  API  is  funding  an  effort  to  make  a  PC-compatible  version  of  DEGADIS).  At
the  other  extreme,  some  models  such  as  OB/DG  consist  of  only  a  single  equation
and  therefore  need  not  be  installed  on  any  computer.  Nevertheless,  such
models  are  included  in  our  collection.  Table  6  contains  information  on  11
models  currently  on  our  microcomputers:

AFTOX*,  AVACTA  II,  CHARM*,  D2PC,  MADICT,  OB/DG*,  OME,
PUFF/INPUFF,  RVD,  SLAB*,  and  SPILLS.

The  asterisks  indicate  models  that  were  developed  either  wholly  or  partially
under  the  support  of  the  U.S.  Air  Force.

The  CHARM  model  is  the  most  comprehensive  of  those  on  the  list,  but
is  a  proprietary  model  which,  because  of  partial  support  by  the  AFESC,  is
given  to  U.S.  Air  Force-affiliated  users.  The  other  models  all  have  certain
limitations  that  make  them  less  comprehensive  (or  desirable)  than  several  of
the  proprietary  models  in  Table  5  (such  as  SAFER,  HASTE,  and  MIDAS  or
CARE).  The  general  characteristics  and  limitations  of  the  models  are  briefly
listed  below.

AFTOX.  This  model  was  developed  by  AFGL  (Reference  2)  as  a
replacement  for  the  OB/DG  model.  The  AFTOX  model,  described  in  Section  II-A,
is  a  Gaussian  model  that  does  not  treat  dense  gases.  It  will  handle
instantaneous  or  continuous  evaporative  emissions  from  spills,  but  will  not
handle  jet  releases.  It  has  been  extensively  compared  with  field  tests  of
continuous,  neutrally  buoyant  emissions  from  the  Prairie  Grass,  Green  Glow,
Dry  Gulch,  and  Ocean  Breeze  sites.

AVACTA  II.  This  is  a  puff  model  developed  by  Zannetti  (Reference  53)
that  will  account  for  variable  wind  fields  in  complex  terrain.  It  is  suitable
for  instantaneous,  transient,  or  continuous  releases.  However,  it  cannot  be
used  for  dense  gases.

CHARM.  As  stated  above,  the  CHARM  model  (Reference  35)  is  the  most 
comprehensive  (by  far)  of  the  group  of  11  models,  but  is  not  fully  available. 

D2PC.  This  is  a  U.S.  Army  model  used  for  calculating  the  dispersion
of  specific  munitions  (Reference  9).  The  physical  components  of  the  model  are
excellent,  but  the  user  cannot  get  into  the  code  to  make  use  of  these
components.  The  user-friendly  system  is  highly  simplified  and  deals  only  with
special  munitions.

MADICT.  This  model  (Reference  54)  is  very  similar  to  AVACTA  II  in
that  it  is  a  puff  model  that  accounts  for  variable  wind  fields  in  complex
terrain.  Dense  gases  are  not  treated.

OB/DG.  The  OB/DG  model  is  a  one-line  empirical  correlation  developed
by  Nou  (Reference  27)  for  application  to  Air  Force  sites  where  liquid
propellants  are  stored.  It  is  valid  only  for  the  range  of  conditions  used  in
its  derivation:

Daytime  stabilities

Ground-level  continuous  neutrally-buoyant  source

Downwind  distances  less  than  about  5  km.
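The  three  conditions  above  can  be  turned  into  a  simple  applicability  check
before  the  OB/DG  correlation  is  used.  The  sketch  below  is  an  assumption-laden
illustration:  the  class  and  field  names  are  hypothetical,  the  5  km  cutoff  is
treated  as  a  hard  limit  although  the  text  says  only  "about  5  km,"  and  the
OB/DG  equation  itself  is  not  reproduced.

```python
from dataclasses import dataclass

# The text says "less than about 5 km"; a hard cutoff is an assumption here.
MAX_DOWNWIND_KM = 5.0


@dataclass
class ReleaseScenario:
    """Hypothetical description of a release; field names are illustrative."""
    daytime: bool
    ground_level: bool
    continuous: bool
    neutrally_buoyant: bool
    downwind_km: float


def obdg_range_violations(scenario):
    """List the ways a scenario falls outside the OB/DG derivation range."""
    violations = []
    if not scenario.daytime:
        violations.append("not a daytime stability condition")
    if not (scenario.ground_level and scenario.continuous
            and scenario.neutrally_buoyant):
        violations.append(
            "not a ground-level continuous neutrally-buoyant source")
    if scenario.downwind_km >= MAX_DOWNWIND_KM:
        violations.append("downwind distance is not less than about 5 km")
    return violations


inside = ReleaseScenario(True, True, True, True, downwind_km=2.0)
outside = ReleaseScenario(False, True, True, False, downwind_km=8.0)
print(obdg_range_violations(inside))    # empty list: within the stated range
print(obdg_range_violations(outside))   # three violations
```

Returning  the  list  of  violations,  rather  than  a  single  yes/no  flag,  lets  a
user  see  which  derivation  condition  a  scenario  fails.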

OME.  This  is  a  highly-simplified  screening  model  used  by  the  Ontario
Ministry  of  the  Environment  (Reference  55)  for  application  to  a  wide  variety
of  hazardous  gas  releases.  It  can  be  used  for  dense  gas  jets  or  spills,  but
has  been  shown  to  have  questionable  performance  when  compared  with  field  data.

PUFF/INPUFF.  This  EPA  model  (Reference  56)  applies  to  puff  transport
and  dispersion  of  nondense  gases.  It  is  similar  to  AVACTA  II  and  MADICT,  but
does  not  handle  spatially  variable  wind  fields  in  complex  terrain.

RVD.  The  Relief  Valve  Discharge  (RVD)  model  of  the  EPA  applies  only
to  elevated  continuous  releases  of  dense  gases,  and  is  based  on  the
Hoot-Meroney-Peterka  model.  The  primary  output  is  the  concentration  at  the
point  the  plume  touches  the  ground.  It  includes  an  arbitrary  conservative
factor  of  five.


SLAB.  The  SLAB  model  is  being  modified  by  Ermak  et  al.  (Reference 
13)  under  support  of  the  API  and  the  AFESC  so  that  it  can  operate  on  a  PC  and 
so  that  it  can  handle  transient  releases.  It  employs  an  evaporative  source 
term  and  can  account  for  dense  gas  slumping.  The  modifications  are  not  yet 
complete. 


SPILLS.  The  SPILLS  model  (Reference  56)  does  what  its  name  says  and  models
evaporation  from  hazardous  spills.  However,  it  employs  a  simple  Gaussian
dispersion  algorithm  to  calculate  downwind  dispersion  that  does  not  include
the  effects  of  dense  gas  slumping.  It  was  used  as  a  basis  for  the  development
of  the  AFTOX  model.

Note:  Raj  et  al.  (Reference  38)  are  developing  the  ADAM  model  for  AFGL  that
will  account  for  two-phase  jet  releases  and  dense  gas  slumping.  It  will  be
released  in  1988.
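The  model  descriptions  above  can  be  collected  into  a  small  lookup  table  so
that  candidate  models  can  be  screened  by  capability.  The  sketch  below  encodes
only  what  this  section  states  about  dense  gas  treatment;  the  dictionary
layout  and  function  name  are  assumptions,  and  models  whose  dense  gas
capability  is  not  stated  here  (CHARM,  D2PC)  are  marked  None  rather  than
guessed.

```python
# Dense gas treatment as described in this section.
# True = treats dense gases, False = does not, None = not stated here.
MODELS = {
    "AFTOX": {"kind": "Gaussian", "dense_gas": False},
    "AVACTA II": {"kind": "puff", "dense_gas": False},
    "CHARM": {"kind": "proprietary", "dense_gas": None},
    "D2PC": {"kind": "munitions dispersion", "dense_gas": None},
    "MADICT": {"kind": "puff", "dense_gas": False},
    "OB/DG": {"kind": "empirical correlation", "dense_gas": False},
    "OME": {"kind": "screening", "dense_gas": True},
    "PUFF/INPUFF": {"kind": "puff", "dense_gas": False},
    "RVD": {"kind": "elevated continuous release", "dense_gas": True},
    "SLAB": {"kind": "evaporative source with slumping", "dense_gas": True},
    "SPILLS": {"kind": "evaporation plus Gaussian", "dense_gas": False},
}


def models_with_dense_gas(models):
    """Return the sorted names of models explicitly said to treat dense gases."""
    return sorted(name for name, info in models.items()
                  if info["dense_gas"] is True)


dense_capable = models_with_dense_gas(MODELS)
print(dense_capable)
```

Only  three  of  the  11  models  are  described  here  as  treating  dense  gases,
which  is  consistent  with  the  limitations  noted  for  the  publicly  available
models.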

D.  REVIEW  OF  DATA  SETS 

Many  field  experiments  have  been  conducted  for  the  purpose  of  evaluating 
dispersion  models.  Draxler  (Reference  58)  reviews  those  carried  out  with 
positively  or  neutrally  buoyant  sources.  Table  7  summarizes  several  of  these 
experiments  that  utilized  neutrally  buoyant  tracers.  Hanna  and  Drivas
(Reference  1)  review  those  carried  out  with  negatively  buoyant  sources,  and 
Table  8  contains  a  summary  from  that  document.  The  following  subsections 
provide  a  review  of  specific  data  sets  used  later  for  evaluation  of  models  or 
planned  for  model  evaluation  in  Phase  II  of  this  project.  The  first  section 
is  a  review  of  tests  that  used  negatively  buoyant  tracers;  the  second  section 
reviews  neutrally  buoyant  tests.  More  detailed  descriptions  of  the  Prairie
Grass,  Green  Glow,  Dry  Gulch,  Ocean  Breeze,  and  Thorney  Island  experiments  are
given  in  Appendix  A. 

1.  Negatively-Buoyant  Tracer  Tests 

a.  ESSO/API,  Gaz  de  France,  DGA  Netherlands 

These  tests,  conducted  in  the  early  1970s,  were  of  limited
scope  and  applicability.  All  were  on  a  much  smaller  scale  than  more  recent 
experiments  (Reference  59).  ESSO/API  and  the  Gaz  de  France  used  LNG  released 
onto  sea  and  soil  surfaces,  respectively.  The  DGA  Netherlands  tests  used 
Freon-12  as  the  tracer  and  released  it  onto  sand.  The  data  compiled  from 
these  tests  were  not  nearly  as  comprehensive  as  those  from  more  recent  tests. 
As  these  data  are  of  limited  use  in  model  development  or  evaluation,  they  have 
not  been  compiled  and  put  into  the  SRC  database. 

b.  Porton  Down 

Forty-two  releases  of  Freon-12,  each  approximately  40  cubic
meters  in  volume,  were  conducted  at  Porton  Down,  England,  in  1978.  The  tests
encompassed  a  range  of  initial  gas  cloud  densities,  wind  speeds,  and  varying
degrees  of  surface  roughness.  Concentration  measurements  were  made  by  eight
monitors  located  on  two  masts  25  meters  downwind  and  5  degrees  to  either  side
from  the  expected  cloud  centerline.  Wind  data  were  measured  by  10  sets  of
anemometers  and  nine  wind  vanes,  all  of  which  were  designed  at  the  Chemical
Defense  Establishment  at  Porton.  Temperature  and  relative  humidity  were  also
measured  using  two  aspirated  psychrometers  located  0.5  and  4  meters  above
ground  level.  Estimates  of  the  Pasquill  stability  class  were  made  and  the
roughness  length  was  determined.

TABLE  7.  EXAMPLES  OF  NEUTRAL  TRACER  FIELD  EXPERIMENTS.

Test and Reference                   Tracer  No. of  Release  Release   Averaging  Source
                                             Tests   Type     Height    Time       Strength

Prairie Grass, Barad 1958            SO2     70      C        Ground    10 min     40.2-104.1 g/sec
Green Glow, Barad & Fuquay 1962      FP      28      C        Ground    30 min     .85-7.04 kg/30 min
Ocean Breeze, Haugen & Fuquay 1963   FP      78      C        Ground    30 min     .55-3.31 kg/30 min
Dry Gulch, Haugen & Fuquay 1963      FP      109     C        Ground    30 min     1.1-3.3 kg/30 min
Sand Storm, Taylor 1965              Be      43      QI       Ground               Variable
Cabauw, Agterberg et al. 1983        SF6     30      C        Elevated  30 min     1.06-4.61 g/sec

Legend

SO2  Sulfur  Dioxide
FP   Fluorescent  Particles
Be   Beryllium
SF6  Sulfur  Hexafluoride
C    Continuous
QI   Quasi-Instantaneous

TABLE  8.  EXAMPLES  OF  DENSE  GAS  FIELD  EXPERIMENTS  (REFERENCE  1).

Site and Reference                              Material      No. of  Q                 Q'             Surface
                                                              Tests   (m3 liquid)       (m3/min liq)

DGA Netherlands, van Ulden 1974                 Freon 12      2       1000 kg           -              sand
Gaz de France, Humbert and Montet 1972          LNG           40      -                 0.16           soil
ESSO/API, Feldbauer et al. 1972                 LNG           17      .09-10.2                         sea
HSE Porton, Picknett 1982                       Freon         35      40 m3 gas         -              grassland
Maplin Sands, Puttock et al. 1982               LNG, propane  34      27                1-5            sand, sea
China Lake Burro, Koopman et al. 1982           LNG           8       40                12-18          pond
China Lake Coyote, Goldwire et al. 1983         LNG           15      3-28              6-19           pond
Desert Tortoise, Koopman et al. 1984            NH3           4       60 over 7.5 min.                 sand
Eagle                                           N2O4          6       4.2 over 3 min.                  sand
Thorney Island, Puttock and Colenbrander 1985   Freon         28      2000              Instantaneous  airport

c.  Maplin  Sands 

Thirty-four  spills  of  refrigerated  liquid  propane  and  liquefied
natural  gas  were  conducted  in  an  area  of  tidal  sands  on  the  north  side  of  the
Thames  River  estuary  in  England  during  1980  (Reference  40).  Shell  Research
Limited  performed  the  tests  under  conditions  of  offshore  winds  for  safety
purposes.  The  release  point  was  350  meters  offshore;  if  possible,  spills  were
made  at  high  tide.  Gas  sensors  were  located  on  71  floating  pontoons,  usually
at  heights  of  0.5,  1.4,  and  2.4  meters  above  the  sea  surface.  Other
instrumentation  included  thermocouples  and  sonic  anemometers.  Two  special
pontoons  provided  vertical  profiles  of  wind  speed  and  temperature  up  to  10
meters;  additionally,  wind  direction,  relative  humidity,  solar  insolation,
water  temperature,  and  wave  height  were  measured.  Detailed  information  on
several  of  these  spills  is  included  in  the  computer  data  base  set  up  at  Sigma
Research  Corporation.

d.  Burro 

The  Burro  series  of  nine  tests  was  held  at  China  Lake,
California,  in  1980.  Forty-cubic-meter  spills  of  liquefied  natural  gas  of
varying  duration  were  conducted  under  the  sponsorship  of  the  U.S.  Department
of  Energy  and  the  Gas  Research  Institute  (Reference  40).  Gas  sensors  and
thermocouples  were  mounted  at  three  heights  on  30  stations  located  on  four
arcs  57,  140,  400,  and  800  meters  from  the  release  point.  Five  of  the  30
stations  had  three  anemometers  mounted  as  well.  Wind  data  were  also  collected
from  an  array  of  20  stations  that  each  had  an  anemometer  mounted  at  2  meters
above  ground  level.  The  water  basin  in  which  the  LNG  was  spilled  was  58
meters  in  diameter  and  1  meter  deep.  The  terrain  was  relatively  flat,  sloping
slightly  downward  from  left  to  right  as  viewed  looking  downwind  from  the
release  point.  Detailed  information  on  four  of  these  spills  is  included  in
the  SRC  data  base.

e.  Coyote 

The  Coyote  series  of  ten  tests  was  conducted  at  China  Lake,
California,  in  1981  (Reference  40).  Spill  volumes  of  LNG  were  fairly  small  and
both  spill  volume  and  duration  varied  over  these  tests,  which  were  sponsored
by  the  U.S.  Department  of  Energy  and  the  Gas  Research  Institute.  Thirty
stations  instrumented  with  gas  sensors  and  thermocouples  were  arrayed  between
100  and  500  meters  downwind  from  the  source.  Wind  data  were  collected  from  an
array  of  20  stations,  each  with  an  anemometer  mounted  at  2  meters  above  ground
level.  The  water  basin  in  which  the  LNG  was  spilled  was  58  meters  in  diameter
and  1  meter  deep.  The  terrain  at  the  test  site  was  relatively  flat  and  sloped
slightly  from  left  to  right  as  viewed  looking  downwind  from  the  source.
Detailed  information  on  three  of  these  spills  is  included  in  the  SRC  data
base.


f.  Desert  Tortoise 

Four  releases  of  NH3  were  conducted  at  Frenchman  Flat,  Nevada,
in  1983  by  the  Lawrence  Livermore  National  Laboratory  (Reference  40).  All
tests  were  done  under  conditions  of  constant  pressure  and  flat  terrain  with
both  stable  and  neutral  atmospheric  conditions.  Vapor  cloud  concentration  and
temperature  were  measured  on  arcs  located  100,  800,  and  1400  meters  downwind.
Wind  data  were  collected  from  cup  and  vane  anemometers  located  both  upwind  and
downwind  of  the  source  at  2  meters  above  ground  level.  Detailed  information
on  two  of  these  spills  is  included  in  the  SRC  data  base.

g.  Eagle 

Six  releases  of  nitrogen  tetroxide  (N2O4)  were  conducted  at  the
Department  of  Energy  Nevada  Test  Site  in  1983  by  the  Lawrence  Livermore
National  Laboratory  for  the  U.S.  Air  Force  (Reference  15).  Vapor  cloud
concentration  and  temperature  data  were  measured  on  five  10-meter  towers
located  785  meters  downwind  of  the  source.  The  sensors  were  located  1,  3.5,
and  8.5  meters  above  ground  level.  Ten-second  average  values  of  wind  speed,
direction,  and  standard  deviation  of  direction  were  determined  at  nine
locations  from  two-axis,  cup-and-vane  anemometers  sited  at  2  meters  above
ground  level.  Information  on  two  of  these  spills  is  included  in  the  SRC  data
base.

h.  Thorney  Island 

Thorney  Island,  England,  was  the  site  of  18  unobstructed,
large-scale  releases  of  a  heavy  gas  tracer  during  the  summer  of  1982  and  1983 
(Reference  48).  The  experiment  was  conducted  by  the  National  Maritime 
Institute  under  contract  to  the  United  Kingdom  Health  and  Safety  Executive. 
Instantaneous  releases  of  2000  cubic  meters  of  a  gas  mixture  of  Freon-12  and
nitrogen  were  accomplished  using  an  accordion-type  container.  Tracer
concentrations  were  measured  using  gas  sensors  placed  at  heights  of  0.4,  2.4,
4.4,  and  6  meters  above  ground  on  38  fixed  and  four  mobile  masts  located  in  a
100-  by  100-meter  grid.  Meteorological  measurements  included  wind  speed  and
direction,  turbulence,  temperature,  pressure,  relative  humidity,  and  solar
insolation.  Pasquill-Gifford  stability  classes  were  determined.

The  test  site  was  an  abandoned  airfield  complete  with  runways.
The  locale  was  flat  and  uniform  over  an  area  of  1  by  0.5  kilometers.  The
surface  was  grass  interspersed  with  tarmac  runways.  The  roughness  length  was
determined  to  be  1  centimeter.  Wind  data  were  measured  at  10  meters  above
ground  level.  Information  on  five  of  these  tests  is  included  in  the  SRC  data
base.

2.  Neutrally-Buoyant  Tracer  Tests

a.  Project  Prairie  Grass

Project  Prairie  Grass,  designed  by  Air  Force  Cambridge  Research
Center  personnel,  was  held  in  north  central  Nebraska  near  O'Neill  in  the  summer
of  1956  (Reference  60).  SO2  was  released  continuously  over  10-minute  periods
from  ground  level  in  the  70  trials  that  comprised  the  project.  Dosage
measurements  were  made  on  arcs  located  at  50,  100,  200,  400,  and  800  meters
downwind.  About  half  of  the  trials  were  conducted  during  unstable  daytime
conditions  and  the  rest  were  held  at  night  with  temperature  inversions
present.  Meteorological  measurements  included  wind  speed,  direction,  and
fluctuations  in  direction  from  cup  anemometers  and  airfoil-type  wind  vanes.
Micrometeorological  data,  rawinsonde  data,  and  aircraft  soundings  were  also
taken.


The  site  was  located  on  virtually  flat  land  covered  with  natural
prairie  grasses  (latitude  42°  29.6'N,  longitude  98°  34.3'W).  The  roughness
length  determined  for  the  site  by  some  of  the  researchers  was  0.6  centimeters.
Dosages  were  measured  at  a  height  of  1.5  meters  along  the  arcs  using  midget
impingers.  The  meteorological  data  were  given  in  10-minute  averages.
Information  on  all  of  these  tests  is  included  in  the  SRC  data  base.

b.  Project  Green  Glow 

Project  Green  Glow,  a  joint  program  designed  by  the  Hanford
Laboratories  of  General  Electric  and  the  Air  Force  Cambridge  Research
Laboratories,  was  held  at  the  Hanford  reservation  in  south  central  Washington
in  1959  (Reference  61).  Fluorescent  particles  (which  gave  off  a  green  glow
under  ultraviolet  light)  were  released  continuously  over  30-minute  periods,
using  aerosol  fog  generators,  in  the  26  trials  that  made  up  the  experiment.
Tracer  dosages  were  measured  using  membrane  filters  at  arcs  200,  800,  1600,
3200,  12800,  and  25600  meters  downwind.  All  of  the  trials  were  held  at  night
under  stable  conditions.  Meteorological  measurements  included  wind  speed  and
direction,  temperature,  and  dew-point  temperature  from  a  410-foot  tower  and  a
78-foot  mast,  and  wind  data  only  from  18  Hanford  remote  stations.  Rawinsonde
data  were  also  taken.

The  site  was  surrounded  by  elevated  terrain  and  drainage  flows
were  common.  The  surface  vegetation  consisted  of  desert  grasses  interspersed
with  sagebrush  1  to  2  meters  in  height.  No  roughness  length  values  were  given
in  the  project  report;  from  site  descriptions,  it  is  estimated  to  be  from  1  to
3  centimeters.  Tracer  dosages  were  measured  using  a  Rankin  counter  and  a
multiplier  phototube.  Special  assaying  procedures  were  used  on  samples  that
had  been  contaminated  by  blowing  dust.  Information  on  all  of  these  tests  is
available  in  the  SRC  data  base.

c.  Project  Ocean  Breeze


Project  Ocean  Breeze  was  conducted  at  Cape  Canaveral,  Florida, 
by  Air  Force  and  General  Electric  personnel  during  1961  and  1962  (Reference 
29).  Fluorescent  particles  were  released  continuously  from  ground  level  for 
30-minute  periods  using  aerosol  fog  generators  in  the  76  trials  comprising  the 
project.  Tracer  dosages  were  measured  at  arcs  located  0.75,  1.5,  and  3  miles 
downwind  at  a  height  of  15  feet  above  ground  level.  Many  of  the  trials  were 
run  under  sea  breeze  conditions.  Meteorological  measurements  included  wind
speed  and  direction  using  Belfort  devices  sited  12  feet  above  ground  and
temperature  profiles  from  captive,  wiresonde-instrumented  balloons.  Standard
synoptic  and  rawinsonde  data  were  also  provided.
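The  Ocean  Breeze  distances  and  heights  are  reported  in  miles  and  feet,
whereas  most  of  the  other  data  sets  in  this  section  use  meters.  The  short
conversion  below  is  offered  only  as  a  convenience  for  comparing  the
experiments;  the  variable  names  are  assumptions,  and  the  conversion  factors
are  the  standard  international  definitions  of  the  mile  and  the  foot.

```python
METERS_PER_MILE = 1609.344   # international mile
METERS_PER_FOOT = 0.3048     # international foot

# Values quoted in the Ocean Breeze description.
arc_distances_miles = [0.75, 1.5, 3.0]
sampler_height_ft = 15.0
wind_sensor_height_ft = 12.0

# Arc distances rounded to the nearest meter.
arc_distances_m = [round(d * METERS_PER_MILE) for d in arc_distances_miles]
sampler_height_m = sampler_height_ft * METERS_PER_FOOT
wind_sensor_height_m = wind_sensor_height_ft * METERS_PER_FOOT

print("Sampling arcs (m):", arc_distances_m)
print(f"Sampler height: {sampler_height_m:.2f} m")
print(f"Wind sensor height: {wind_sensor_height_m:.2f} m")
```

The  arcs  therefore  lie  at  roughly  1.2,  2.4,  and  4.8  kilometers  downwind,
with  samplers  about  4.6  meters  above  ground.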

The  test  site  was  on  the  missile  range  located  on  the
east  coast  of  Florida.  The  locale  was  characterized  by  rolling  sand  dunes
10  to  20  feet  tall.  In  addition,  much  of  the  diffusion  course  was  covered
with  brushwood  and  palmetto  growth.  No  roughness  length  value  was  given  in
the  project  report.  Tracer  dosage  samples  were  assayed  using  a  Rankin  counter
and  a  multiplier  phototube  (as  in  Project  Green  Glow).  Information  on  all  of
these  tests  is  available  in  the  SRC  data  base.

d.  Project  Dry  Gulch 

Project  Dry  Gulch  was  conducted  at  Vandenberg  Air  Force  Base, 
California,  by  Air  Force  and  General  Electric  personnel  during  1961  and  1962 
(Reference  29).  Fluorescent  particles  were  released  continuously  from  ground 
level  over  30-minute  periods  using  aerosol  fog  generators  in  the  109  trials 
that  comprised  the  project.  Tracer  dosages  were  measured  with  membrane 
filters  on  arcs  located  2301  and  5665  meters  downwind  on  diffusion  Course  B
and  853,  1500,  and  4715  meters  downwind  on  Course  D.  Many  of  the  trials  were
run  under  sea  breeze  conditions.  Meteorological  measurements  included  wind 
data  from  Belfort  devices  placed  12  feet  above  ground  level  and  temperature 
data  from  wiresonde  devices.  Rawinsonde  data,  including  many  special 
launches,  were  also  provided. 

The  test  site  was  located  on  Burton  Mesa.  The  terrain  was  quite
complex;  no  stretch  of  the  imagination  would  deem  it  flat.  No  roughness
length  value  was  given  in  the  project  report,  but  it  would  undoubtedly  be  high
due  to  the  terrain.  Tracer  dosages  were  assayed  using  Rankin  counters  and
multiplier  phototubes.  Information  on  all  of  these  tests  is  available  in  the
SRC  data  base.


e.  Project  Sand  Storm 

Forty-three  quasi-instantaneous  releases  of  rocket  solid
propellant  mixed  with  metallic  beryllium  were  carried  out  at  Edwards  Air  Force
Base,  California  (Reference  62).  All  tests  were  carried  out  under  unstable
atmospheric  conditions.  Rocket  motor  firing  times  ranged  from  2  to  8  seconds.
Initial  puff  diameters  of  15  to  45  meters  were  observed  visually.  Membrane
filters  were  used  to  collect  the  tracer  along  arcs  from  100  to  2400  meters
downwind.  After  the  first  14  trials,  the  samplers  were  consolidated  onto  six
arcs  located  from  200  to  2400  meters  away  from  the  source.

Meteorological  data  were  collected  on  an  instrumented,  60-meter
tower  located  60  meters  upwind  of  the  release  point.  Wind  speed  and
direction  as  well  as  temperature  data  were  collected  at  multiple  levels.  The
standard  deviation  of  the  wind  direction,  σθ,  was  also  measured.

f.  Cabauw,  Netherlands 

A series of dispersion tests was carried out at Cabauw in the
Netherlands between 1977 and 1978 under the auspices of the Royal Netherlands
Meteorological Institute and the KEMA Laboratories. The complete data sets of
15 trials are to be found in Reference 63. These tests consisted of elevated
releases (80 or 200 meters) of SF6 from the instrumented 213-m mast.
Concentration measurements were made on an arc 4 kilometers downwind.
Meteorological measurements included wind speed and direction, temperature,
and turbulence from the instruments on the mast (Reference 64), radiosonde,
sonar, synoptic observations, and radiation measurements.

The mast is located in the flat, central portion of the
Netherlands (latitude 51°58' N and longitude 4°56' E) between Schoonhoven and
Lopik. The surrounding terrain, while flat for a radius of about 20
kilometers, is dotted with small villages, lines of trees, river dikes (the
river Lek is about 1 km away at its closest point), and meadows. Owing to the
variety of surface covering around the mast, the surface roughness length,
$z_0$, varies with wind direction and time of year from 6 to 25 cm.

Each of the 15 trials was composed of two consecutive 30-minute
tests; the corresponding mast data are presented as 30-minute averages.
Data analysis and comparisons with other tests can be found in References 65
and 66. All of these data are contained in the Sigma Research database.

SECTION III


COMPONENTS  OF  MODEL  UNCERTAINTY 

A.  OVERVIEW  OF  THREE  COMPONENTS  OF  MODEL  UNCERTAINTY 

Much work has been done on model uncertainty by researchers in other
areas, including economics, ecology, and health sciences. These persons must
also deal with widely scattered data, incomplete input data, non-Gaussian
distributions, and wide confidence bounds. The following discussion
summarizes the general approach to model uncertainty that has been adopted for
this research, which closely parallels recommendations by Hanna (Reference
38).


If major decisions are to be made regarding the choice of an appropriate
model of hazardous gases, evacuation plans, and risk assessments, it is
important to have the best possible information on our confidence in the
models that are used and the data that are being collected. It may even be
possible to build the confidence intervals (uncertainty) into the
decision-making process. There are three components of total error or
uncertainty in models for source emissions, transport, and dispersion of
hazardous gases:

I.   Errors caused by model physics assumptions
II.  Random variability (turbulence)
III. Errors generated by input data errors

These components have not yet been studied in any comprehensive way. Our
general philosophy on model evaluation and uncertainties is shown in Figure
10, where the three components of uncertainty are plotted as a function of the
number of parameters in the model. The total uncertainty can be large for
models with a large number of parameters, due to the combined effect of data
input errors. It is desirable to construct a model such that the total model
uncertainty on the figure is at its lowest point. It is seen that a complex
model is not always the best model in a given application. In many instances
a Gaussian dispersion model or a simple scaling relation (e.g., $C \propto Q/ux$)
provides the lowest uncertainty.

Figure 10. Illustration of Variation of Model Uncertainty Components
with Number of Parameters in Model. (Vertical axis: Uncertainty;
horizontal axis: Number of Parameters in Model.)

The uncertainties can be quantified by defining total uncertainty as the
mean square residual, $\overline{(C_p - C_o)^2}$, and assuming that $C_p$ and
$C_o$ are given by:

$$C_o = \overline{C_o} + C_o' + \Delta C_o \qquad (5)$$

$$C_p = \overline{C_p} + C_p' + \Delta C_p \qquad (6)$$

where $C_p$ is the predicted parameter (for example, concentration) and $C_o$
is the observed parameter. Other definitions are given below:

$\overline{C_o}$ is the ensemble average that would apply to these external
conditions

$C_o'$ is the stochastic (random) fluctuation about the ensemble average

$\Delta C_o$ is the data error in the observation $C_o$

$\overline{C_p}$ is the predicted ensemble average

$C_p'$ is the predicted stochastic (random) fluctuation about the ensemble
average

$\Delta C_p$ is the error due to data input errors

No current hazardous gas models predict the stochastic fluctuation $C_p'$. If
it is assumed that there is no correlation among the components, then the
total uncertainty, or the sum of the three components shown in Figure 10, is
given by subtracting Equation (5) from Equation (6), squaring the
difference, and averaging (overbar):

$$\overline{(C_p - C_o)^2} = (\overline{C_p} - \overline{C_o})^2 + \overline{C_o'^2} + \overline{\Delta C_o^2} + \overline{\Delta C_p^2} \qquad (7)$$

The left-hand side is the total model uncertainty. On the right-hand side,
the first term is the model physics error (component I), the second term is
the stochastic uncertainty (component II), and the last two terms are the
data errors (component III).

Observations have shown that the stochastic uncertainty component II is
roughly equal to the square of the mean (i.e.,
$\overline{C_o'^2}/\overline{C_o}^2 \approx 1$) for small
averaging times (for example, a few seconds) and can be predicted by some
models (for example, Reference 67). The data errors component III is also
believed to be of the same order as the square of the mean. The model physics
error component I can be estimated by solving Equation (7) for that component,
given the total model uncertainty and components II and III. Of course, this
procedure is highly uncertain if the components II and III are approximately
equal to the total model uncertainty.
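
The residual procedure described above can be sketched in a few lines. This is
only an illustration under stated assumptions: the function name, argument
names, and sample numbers are hypothetical and do not come from the report. It
treats Equation (7) as total = I + II + III and returns component I together
with the fraction of the total taken up by components II and III, which
indicates how reliable the residual is.

```python
def model_physics_error(total_msr, stochastic_var, obs_data_var, pred_data_var):
    """Solve Equation (7) for component I (model physics error).

    All arguments are mean-square quantities in the same units
    (for example, concentration squared). Names are illustrative only.
    """
    if total_msr <= 0:
        raise ValueError("total mean square residual must be positive")
    # Components II and III, taken as known inputs.
    known = stochastic_var + obs_data_var + pred_data_var
    # Component I is whatever is left of the total.
    component_i = total_msr - known
    # A fraction near 1 means the residual is a small difference of
    # large numbers, so the estimate of component I is unreliable.
    known_fraction = known / total_msr
    return component_i, known_fraction


# Hypothetical numbers, chosen only to show the arithmetic.
component_i, known_fraction = model_physics_error(4.0, 1.0, 0.25, 0.25)
print(component_i, known_fraction)
```

When the returned fraction approaches one, component I is the small difference
of two large numbers, which is the situation the text warns about.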

B.  DATA  INPUT  UNCERTAINTIES 


The  data  errors  component  III  can  be  estimated  based  on  studies  of 
instrument  errors  in  the  field.  It  is  stressed  that  some  QC  (quality  control) 
procedures,  such  as  checking  the  voltage  output  of  an  anemometer  at  a  given 
rotation  rate,  do  not  tell  much  about  actual  instrument  error  in  the  field. 
These actual errors are best determined through use of co-located instruments
and comparison with high quality "base-line" instruments. In most field
experiments, this option is not practical.

1.  Uncertainty  in  Tracer  Gas  Observations 

Information on data errors in tracer gas observations was obtained
using co-located instruments as part of the study of dispersion from tall
stack plumes sponsored by the Electric Power Research Institute (EPRI). Table
9 gives some figures on uncertainties for source emissions and tracer gas
observation instruments (Reference 68).

TABLE 9. UNCERTAINTIES IN TRACER GAS OBSERVATIONS AT EPRI KINCAID SITE (FROM
REFERENCE 68)

Parameter                            Standard Deviation or Other Limit
-----------------------------------  -----------------------------------------
Source SF6                           3%
Monitored SF6                        6% for C > 100 ppt, 6 ppt for C < 100 ppt
Wind direction to max. observed C    20°

The wind direction error is obtained by calculating the difference
between the observed wind direction and the direction from the source to the
receptor with the maximum observed concentration. The site consists of flat
farmland with some manmade lakes interspersed in the area, and the instruments
were all research grade monitors operated by technicians. Thus these data
errors represent the minimum of what would be found in a monitoring network in
the neighborhood of an industrial site. In terms of Equation 7, the minimum
$[\overline{\Delta C_o^2}/\overline{C_o}^2]^{1/2}$ is expected to be about
0.06. It is important to note that this relative error grows quickly at
concentrations near the threshold of the instrument.

2. Uncertainty in Meteorological Data

Typical uncertainties associated with the measurement of
meteorological data have been addressed in several studies. The results from
five studies are reviewed here.

The earliest of this group of studies is the Prairie Grass project.
Although several different organizations were involved in making
meteorological measurements, the project report (Reference 60) cites accuracy
assessments for only the slow response MIT measurements taken at an elevation
of 2 meters. The second reference reviewed is the report from the workshop on
on-site meteorological measurements (Reference 69). Although not concerned
with any one specific field program, the attendees reported on their
collective experience in making meteorological measurements. The third
reference is a study that was specifically designed to compare measurements
obtained from 5 types of mechanical wind sensors and a sonic anemometer
(Reference 70). The test instruments were mounted on separate 10-meter-tall
masts, set approximately 5 meters apart (across the flow). The fourth
reference assesses measurements made in support of the EPRI plume model
validation study at the Kincaid and Bull Run sites (Reference 68), while the
fifth includes a report on measurements made at Dugway Proving Grounds
(Reference 71). The latter report is particularly interesting in that
comparisons are made between "identical" wind instruments installed 500 meters
apart.


Table  10  presents  information  obtained  from  each  of  the  references. 
All  of  the  references  noted  that  the  mechanical  wind  sensors  are  not  as 
reliable  for  wind  speeds  of  less  than  approximately  2  m/s  because  of  starting 
thresholds  and  response  times.  As  stated  in  the  workshop  report  (Reference 
69),  typical  thresholds  attainable  with  cup  and  vane  instruments  are  on  the 
order  of  half  a  meter  per  second.  The  workshop  report  also  brought  up  the 
subject  of  sampling  rate.  For  averaging  times  of  60  minutes,  60  or  more 
samples  will  estimate  the  mean  to  within  5  to  10  percent,  and  360  or  more 
samples  will  estimate  the  standard  deviation  to  within  5  to  10  percent. 

Because  the  sampling  density  and  accuracy  are  related  to  the  time-scales  of 
the  measured  variable,  similar  statements  for  shorter  averaging  times  such  as 
10  to  20  minutes  are  not  applicable.  Most  data  collection  systems  used  in 
these  studies  sample  the  speeds  and  directions  at  least  once  per  second. 

The BAO study results (Reference 70) are predicated on the belief
that the sonic anemometer provides the best attainable measurements of wind
speed and direction, so that measurements from the mechanical systems are only
compared with those from the sonic anemometer. The scatter found in wind
direction measurements (4.5°), relative to the sonics, is surprisingly large.
This caused one of the authors of the study to look at the comparability of
the mechanical systems among themselves. His analysis is not complete, but in
studying one 20-minute period, he finds that the vanes indicate directions to
within 1°, bivanes to within 1.5°, and cups indicate speeds to within 0.1 m/s,
when compared with the prop-vane used in the study. No justification has been
presented for discounting the discrepancy between the sonics and the
mechanical systems.

With regard to the $\sigma_\theta$ data obtained from the BAO report, we note
that the bias in $\sigma_\theta$ (vane) as a function of $\sigma_\theta$
(sonic) departs from near-zero rapidly for $\sigma_\theta$ greater than 30°,
and the standard deviation in the difference between the mechanical systems
and the sonic grows at the same rate. To obtain a better estimate of the
uncertainty in $\sigma_\theta$ measurements with a correction for the bias,
valid throughout the range, we use a very simple representation of the bias
curve for $\sigma_\theta$ greater than 30° to adjust the root-mean-square error
that was reported. With the correction, the root-mean-square error is
approximately 1° + $\sigma_\theta$/12 for $\sigma_\theta$ greater than 30°. For
$\sigma_\theta$ less than 30°, the root-mean-square error is approximately
$\sigma_\theta$/9. Hence, the uncertainty in $\sigma_\theta$ (vane) relative to
the measurements from the sonic anemometer is about 10%, as indicated in the
table.

TABLE 10. TYPICAL UNCERTAINTIES IN METEOROLOGICAL MEASUREMENTS

Source               $u$              $\theta$  $\sigma_\theta$  $\sigma_\phi$  $\Delta T$ (1)  Ave. Time
-------------------  ---------------  --------  ---------------  -------------  --------------  ---------
Prairie Grass        2-5%             2°-5°     10%              -              -               10 min.
  (u > 2 m/s)
Workshop             0.2 m/s + 5%     <3°       5-10%            -              <0.1°C          60 min.
BAO (2)              0.3 m/s          4.5°      3°               1.7°           -               20 min.
BAO (3)              -                -         10%              -              -               20 min.
EPRI                 0.1 m/s          1.5°      13%              20%            -               60 min.
EPRI (tower shadow)  10%              10°       -                -              -               60 min.
DPG (mfg. specs.)    0.1 m/s + 1%     3°        1.2°             -              0.4°C           -
DPG (u > 5 m/s)      6%               -         -                30%            -               10 min.
DPG (u < 2 m/s)      25%              -         -                48%            -               10 min.
DPG (tower wake)     9% (u > 5 m/s)   -         -                -              -               10 min.
                     <4% (u < 2 m/s)  -         -                -              -               10 min.

1. The temperature difference ($\Delta T$) uncertainty applies to any height
interval, since the instrument measures only a difference without regard
to the location of the two points.

2. Mean bias removed (cups, props, vanes, bi-vanes relative to sonics).

3. Bias as a function of indicated $\sigma_\theta$ removed (vanes relative to
sonics).
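
The piecewise estimate of the root-mean-square error in $\sigma_\theta$ quoted
above can be written as a small helper. This is a minimal sketch, not part of
the BAO analysis: the function name is hypothetical, and assigning exactly 30°
to the lower branch is an assumption, since the text only covers values
greater than or less than 30°.

```python
def sigma_theta_rms_error(sigma_theta_deg):
    """Approximate rms error (degrees) of vane sigma_theta relative to the sonic.

    Uses the two approximations quoted in the text:
      sigma_theta / 9          for sigma_theta below 30 degrees
      1 + sigma_theta / 12     for sigma_theta above 30 degrees
    The value at exactly 30 degrees is assigned to the lower branch (assumption).
    """
    if sigma_theta_deg < 0:
        raise ValueError("sigma_theta must be non-negative")
    if sigma_theta_deg > 30.0:
        return 1.0 + sigma_theta_deg / 12.0
    return sigma_theta_deg / 9.0


# Relative error stays near 10 percent on both branches.
for value in (18.0, 42.0):
    error = sigma_theta_rms_error(value)
    print(value, error, error / value)
```

Both branches give a relative error close to one tenth, which is the figure
entered in the table for the vanes.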

Both the EPRI (Reference 68) and DPG (Reference 71) reports include a
measure of the effect of a support tower on wind measurements. Although this
is an important consideration for general data acquisition requirements, it
is of limited importance for short-term experiments in which the field study
can be conducted only for periods in which the instruments are properly
exposed. When exposure is a problem, both studies indicate that the effect on
wind speed can be as large as 10 percent (indicated speeds are lower by about
10 percent).

The  DPG  report  by  White  et  al.  (Reference  71)  provides  some 
information  on  uncertainties  from  lack  of  representative  sampling.  The  word 
‘representative’  refers  to  whether  the  measurement  at  a  given  location  would 
agree  with  a  concurrent  measurement  in  the  same  general  area.  In  looking  at 
the  effect  of  tower  shadows  or  wake  zones,  they  also  present  statistics  for 
periods  when  the  instrumentation  on  neither  tower  is  in  the  shadow.  Under 
these  conditions,  differences  between  "identical"  instruments  located  500 
meters  apart  are  more  likely  the  result  of  differences  in  the  flow,  rather 
than  differences  in  response  or  calibration.  The  data  presented  were  obtained 
from  cup  anemometers,  and  bivanes  mounted  16  meters  above  the  ground.  As 
indicated  in  Table  10,  uncertainty  in  wind  speed  is  about  6  percent  for  speeds 
greater  than  5  m/s,  and  it  rises  to  25  percent  for  wind  speeds  less  than  2 
m/s. The other variable reported is $\sigma_\phi$, the standard deviation of
the vertical elevation angle. For speeds greater than 5 m/s, the uncertainty is
found to be 30 percent. For speeds less than 2 m/s, the uncertainty grows to
48 percent. Although the data are not reported, the same type of information
on $\theta$ and $\sigma_\theta$ should be available, and should be analyzed.

In general, the degree of uncertainty in wind measurements reported
in these five documents is consistent. Without considering representativeness
issues, wind speed measurements are generally within 5 percent of the "true"
value, and wind directions are within 3° to 5°. The recent studies of lateral
turbulence ($\sigma_\theta$) also tend to confirm expectations based on
experience (e.g., the workshop report and the Prairie Grass report), with an
uncertainty in $\sigma_\theta$ of about 10 percent. But vertical turbulence
($\sigma_\phi$) data appear to be less reliable. It appears that $\sigma_\phi$
carries with it an uncertainty of about 20 percent, even when efforts are made
to assure that the vane or prop is functioning properly.

When representativeness is considered, the uncertainty grows
appreciably. Problems of exposure, such as tower shadows or wakes, can easily
double the uncertainty in wind speed and direction. Worse, measurements taken
at one point can differ substantially from measurements made using identical
instrumentation located just 500 meters away. This is particularly true of
turbulence measurements, and certainly wind speed measurements made under
light wind speeds (less than 2 m/s). Although not documented, differences in
wind direction are expected to be equally sensitive to spatial variations
under light wind speed conditions.

Alternate methods for the measurement of vertical turbulence ($\sigma_\phi$)
are frequently employed, given the difficulty of obtaining reliable data on
$\sigma_\phi$ from vanes and props. Stability class is sometimes used as an
alternate method, inferred from observations of wind speed (near the ground)
and surrogates for the sensible heat flux. An estimate of the uncertainty in
$\sigma_\phi$ that arises in the use of these methods can be obtained by
assuming that the resolution in the resulting stability class or category is
no better than one half a class. The Briggs (Reference 72) dispersion
parameter curves for $\sigma_\phi$ in rural areas contain a leading coefficient
for each class that is essentially a mean $\sigma_\phi$ (in radians) for the
category. If a linear trend is computed for these coefficients, from class B
to class F, its slope is approximately 0.02 radians/class. Therefore, an
uncertainty of half a class produces an uncertainty of approximately 0.01
radians, or 20 percent of the mean $\sigma_\phi$ for the entire range (class B
to class F). Therefore, an uncertainty of about 20 percent would be associated
with the use of surrogate methods (via the stability class) for $\sigma_\phi$,
if the variability in $\sigma_\phi$ within each class were ignored. But Luna
and Church (Reference 73), among others, show that the scatter in observed
values of $\sigma_\phi$ associated with each stability class is so great that
any measured $\sigma_\phi$ could belong to any one of the stability classes
selected on the basis of the surrogate methods.
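
The half-class argument can be checked numerically. The sketch below is an
illustration only: the coefficients are the leading coefficients of the
commonly tabulated Briggs open-country vertical dispersion formulas for
classes B through F, which are assumed here rather than taken from this
report, and the function name is hypothetical. It fits a least-squares line
through the coefficients and expresses half the slope as a fraction of their
mean.

```python
# Leading coefficients (radians) for classes B to F. These are assumed
# values from the commonly tabulated Briggs open-country formulas.
BRIGGS_RURAL_COEFFS = {"B": 0.12, "C": 0.08, "D": 0.06, "E": 0.03, "F": 0.016}


def half_class_uncertainty(coeffs):
    """Return (slope per class, half-class uncertainty, relative uncertainty)."""
    values = list(coeffs.values())
    n = len(values)
    x_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    # Ordinary least-squares slope with classes spaced one unit apart.
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    slope = numerator / denominator
    half_class = 0.5 * abs(slope)
    return slope, half_class, half_class / y_mean


slope, half_class, relative = half_class_uncertainty(BRIGGS_RURAL_COEFFS)
print(slope, half_class, relative)
```

With these assumed coefficients the slope magnitude is a little over 0.02
radians per class and the half-class uncertainty is roughly one fifth of the
mean, in line with the rounded figures in the text.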

3.  Air  Force  Meteorological  Data  Errors 

As part of this project, Capt. L. Key (AFESC) contacted several U.S.
Air Force personnel in an attempt to determine QA/QC procedures and expected
meteorological data errors. Capt. M. Davenport's memo of 27 October 1987
contains the results of this survey, which is briefly summarized below:

Average Air Force Base:

    Wind Direction Errors: ± 2°
    Wind Speed Errors: ± 1.5 KT
    Hygrothermograph Errors: ± 2°F (temp.)
                             ± 1.5°F (dew point)
    Calibration Interval - approximately 1 month.
    Quality Control (i.e., double checking of data) - none.
    Reporting procedures:
        Wind direction and speed are rounded to the nearest 10° and 1 KT.
        Actual reading is based on a 1-minute averaging period
        (Airways) or 10-minute averaging period (METAR)

Mesowind Networks at Vandenberg, Edwards, and Patrick:

    Edwards - Maintenance of 19-tower net unsatisfactory

    Patrick - WS threshold - 2 m/s
              WS errors ± 0.15 mi/hr
              WD errors ± 3°
              Temperature: some ± 0.35°C; some ± 3°C
              Calibration Interval - one month
              Quality Control - Flagging software

    Vandenberg - WS Error ± 1 KT
                 WD Error ± 2°
                 Temp. Error ± 1°F
                 No routine calibration - only electronics checks.

C.  STOCHASTIC  OR  TURBULENT  UNCERTAINTIES. 

Discussions of concentration fluctuations and the influences of averaging
and sampling times are either nonexistent or very minimal in the research
reviewed to this point. Some models grossly parameterize this effect (term II
in Equation 7) by assuming that the ratio of the peak (fluctuating)
concentration to the model predicted mean concentration is about two. Chatwin
(Reference 74) pointed out that in many cases involving accidental releases of
hazardous gases, the maximum short term (~1 sec) concentration is the most
important variable to predict. Lung damage from H2S can occur with one breath
if the concentration is sufficiently high, and an explosion of gas from an LNG
accident can occur if a spark is struck in a small volume of gas at the
flammability limit. According to Chatwin the mean concentration predicted by
the model can be irrelevant in these cases, since the probability distribution
function (pdf) of concentration fluctuations in the atmosphere is
characterized by a standard deviation at least as large as the mean. The
relative magnitude of short-term concentration fluctuations ($\sigma_c/C$) is
the same order as the relative magnitude of short-term velocity fluctuations
($\sigma_u/U$) in the atmosphere. The parameters $\sigma_c$ and $\sigma_u$ are
the standard deviations of turbulent fluctuations in concentration and wind
speed, respectively. It is assumed that averaging times are about 1 second and
sampling times are about 10 minutes. Thus it is important to predict the upper
end of the pdf for the H2S and LNG incidents described above. Since Chatwin's
article was published, a few other researchers have studied this problem,
although a comprehensive operational model has not been derived.

Predictions of models such as AFTOX or DEGADIS can be thought of as
ensemble means for certain averaging times. An ensemble mean is defined as
the mean over an infinite number of realizations of a given experiment. The
averaging time is usually implicit in the data used by the model and in its
formulations for treating the input data. For example, if hourly averaged
wind and turbulence observations are used, then the predictions represent a
1-hour average. If the Pasquill-Gifford-Turner dispersion curves are used,
then the predictions represent a 10-minute average, since data from 10-minute
periods were used to derive the curves. In the case of instantaneous (puff)
models, the predictions represent an ensemble mean only to the extent that a
large enough set of experiments (20 or more) was used to derive the model.
These experiments should be conducted under the same external conditions (that
is, wind speed, stability, source term). For example, if it were possible to
run the Thorney Island experiments long enough that 100 independent time
periods of 10-minute duration could be found which all satisfy the following
conditions:

    $4.8 < u < 5.2$ m/s,  $65\% < RH < 70\%$
    $10° < T_a < 12°$C,  $10° < T_{surface} < 12°$C
    $-2 <$ net radiation flux $< 2$ watts/m$^2$
    $(\rho_p - \rho_a)/\rho_a = 2$,  $h = 10$ m,  $R = 10$ m,

then the observed concentration field averaged over these 100 experiments
would approach an ensemble average. It is obvious that it is difficult
operationally and financially to generate ensemble averages from atmospheric
field experiments.

Thus  the  results  of  a  single  experiment,  or  even  three  or  four 
experiments  conducted  under  similar  external  conditions  will  likely  differ 
(perhaps  by  as  much  as  an  order  of  magnitude)  from  the  ensemble  mean 
predictions  of  the  model.  If  this  happens,  it  is  not  an  indictment  of  the 
model  but  may  be  a  manifestation  of  the  inherent  stochastic  variability  of  the 
atmosphere. 

Wind tunnel experiments can be used to study variability, since it is
easier to ensure repeatability of experiments and thus create a large ensemble
of data. On the negative side, the wind tunnel cannot simulate larger scale
eddies and other phenomena that contribute to variability in the atmosphere.
Furthermore, the laboratory Reynolds number is not high enough to permit the
establishment of an inertial subrange like that in the atmosphere.
Meroney and Lohmeyer (Reference 75) conducted extensive studies of dense gas
clouds released in a wind tunnel and calculated the stochastic or random
concentration fluctuation intensity, $\sigma_c/C$, for various source volumes,
wind speeds and downwind distances. An example of these results is plotted in
Figure 11, showing that the average $\sigma_c/C$ is about 0.3 in this wind
tunnel. The value of $\sigma_c/C$ in individual experiments ranges from 0.1 to
0.7. In contrast, Hanna (Reference 67) reports observed values of $\sigma_c/C$
of 1.5 on the plume centerline and 5.0 on the plume edges for a smoke plume
released in the atmospheric boundary layer and concentrations averaged over
one second.

The probability distribution function (pdf) of concentration fluctuations
in the atmosphere has been studied by several persons (References 67, 76, and
77), and all agree that the distribution is non-Gaussian and is skewed towards
higher concentrations. For hazardous gas analysis, we are usually interested
in the probability $P(C > C_L)$ that the concentration is higher than some
limiting value, $C_L$:

$$P(C > C_L) = \int_{C_L}^{\infty} p(C)\, dC \qquad (8)$$

It has been suggested by various persons that the probability distribution
function, $p(C)$, can be approximated by a log-normal, clipped normal, or Gamma
function. The exponential function is a special case of the Gamma function,
and is quite good for intermittent clouds or plumes. Another important factor
is the intermittency, $I$, which is defined as the fraction of time that
non-zero concentrations are observed at a monitor. For the exponential
distribution, $\sigma_c/\overline{C}$ equals one if the intermittency is unity.
In the general case, the pdf is given by the formula:

$$p(C) = (I^2/\overline{C}) \exp(-IC/\overline{C}) + (1 - I)\,\delta(0) \qquad (9)$$

where the Dirac delta function $\delta(0)$ is nonzero only at C equal to 0, so
that the second term carries the probability $(1 - I)$ of observing zero
concentration. This can be substituted into equation (8) to give:

$$P(C > C_L) = I \exp(-I C_L/\overline{C}) \qquad (10)$$

Figure 11. Concentration Variance Ratio, $\sigma_c/C$, versus Downwind Distance,
Observed by Meroney and Lohmeyer (Reference 75) in a Wind
Tunnel. The source is an instantaneous dense gas cloud of
initial volume V.

Thus, if the dispersion model predicts an ensemble mean, $\overline{C}$, of
$0.1 C_L$, where $C_L$ is the threshold concentration for some health effect,
and the intermittency $I$ equals 0.5, then the probability that the
instantaneous C will exceed $C_L$ is 0.3 percent. If the ensemble mean
prediction is $0.5 C_L$, then this probability is 18 percent.
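
The two worked probabilities above follow directly from the exponential
exceedance formula of Equation (10). The sketch below is only an illustration
of that arithmetic; the function and argument names are hypothetical, and it
assumes the intermittent exponential pdf applies.

```python
import math


def exceedance_probability(mean_conc, threshold, intermittency):
    """P(C > C_L) = I * exp(-I * C_L / mean) for the intermittent exponential pdf.

    mean_conc and threshold must share the same units. intermittency is the
    fraction of time with non-zero concentration, in the range (0, 1].
    """
    if mean_conc <= 0:
        raise ValueError("ensemble mean concentration must be positive")
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    if not 0.0 < intermittency <= 1.0:
        raise ValueError("intermittency must lie in (0, 1]")
    return intermittency * math.exp(-intermittency * threshold / mean_conc)


# Ensemble mean equal to 0.1 and 0.5 of the threshold, with I = 0.5.
for mean_fraction in (0.1, 0.5):
    p = exceedance_probability(mean_fraction, 1.0, 0.5)
    print(mean_fraction, round(100.0 * p, 1), "percent")
```

Only the ratio of the threshold to the ensemble mean matters, so the threshold
is set to one here and the mean is expressed as a fraction of it.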

The formulas given above are for nearly-instantaneous averaging times.
It is clear that the standard deviation of concentration fluctuations,
$\sigma_c$, will decrease as averaging time $T$ increases. If the integral time
scale of the concentration fluctuations is $T_I$ and their autocorrelogram is
assumed to be exponential, then the following formulas apply:

$$R(t') = \overline{C'(t)\,C'(t+t')}/\sigma_c^2 = \exp(-t'/T_I) \qquad (11)$$

Then

$$\sigma_c^2(T)/\sigma_c^2(0) = 2(T_I/T)\left(1 - (T_I/T)(1 - \exp(-T/T_I))\right) \qquad (12)$$

where $\sigma_c^2(0)$ refers to the variance for instantaneous averaging time.
If $T_I$ has a typical value of 10 sec for the surface layer then the ratio of
variances for an averaging time of $T$ equal to 60 sec is 0.28. This estimate
of $T_I$ is based on observations of concentration fluctuations during
smoke/obscurant experiments conducted by the U.S. Army (Reference 67). If the
averaging time is one hour, the ratio of standard deviations
$\sigma_c(3600\,\mathrm{s})/\sigma_c(0)$ is 0.075. It can be concluded that the
fluctuation intensity $\sigma_c/C$ for 1-hour averages in the atmosphere is
about 0.1 even if the integral time scale is only a few seconds. For
fluctuations dominated by lateral meandering, where $T_I$ is more like 100 to
1000 seconds, the fluctuation intensity $\sigma_c/C$ for one hour averages is
approximately 0.5.

If it is assumed that the equations in the first part of this section
produce predictions of ensemble mean concentrations, $\overline{C}$, then the
probability of the concentration exceeding any threshold limit, $C_L$, can be
estimated using equations (8) through (12) for any averaging time and integral
time scale.

Equation (12) can be used to assess the effects of averaging over
distances as well as time. Observed concentrations and health effects always
involve some averaging distance. For example, if the integral distance scale
of the turbulence is 5 meters and the averaging distance is 1 meter, then the
ratio $\sigma_c^2(1\,\mathrm{m})/\sigma_c^2(0)$ equals 0.94.
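
The variance reduction of Equation (12) is easy to evaluate for the cases
quoted in the text. This is a minimal sketch that assumes an exponential
autocorrelation; the function name is hypothetical, and the same expression is
applied to time or distance as long as both arguments share units.

```python
import math


def averaged_variance_ratio(averaging, integral_scale):
    """Ratio sigma_c^2(T) / sigma_c^2(0) from Equation (12).

    averaging is the averaging time (or distance) and integral_scale is the
    integral time (or distance) scale, in the same units.
    """
    if averaging < 0 or integral_scale <= 0:
        raise ValueError("averaging must be >= 0 and integral_scale > 0")
    if averaging == 0:
        return 1.0  # limiting value for instantaneous averaging
    r = integral_scale / averaging
    # For averaging far smaller than the integral scale this form loses
    # precision through cancellation; it is adequate for the cases shown.
    return 2.0 * r * (1.0 - r * (1.0 - math.exp(-averaging / integral_scale)))


print(averaged_variance_ratio(60.0, 10.0))               # variance ratio, 60 s
print(math.sqrt(averaged_variance_ratio(3600.0, 10.0)))  # std. dev. ratio, 1 hr
print(averaged_variance_ratio(1.0, 5.0))                 # 1 m average, 5 m scale
```

The one-hour case is printed as a square root because the text quotes that
figure as a ratio of standard deviations rather than of variances.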

At the other end of the scale the sampling time or sampling volume can
also influence observations. The sampling time, $T_s$, can be thought of as the
total length of time that the instrument is turned on. The likelihood of more
extreme concentrations being observed is increased if the sampling time
increases (for example, several new "record" high and low temperatures are
observed at any given weather station each year). The usual definition of any
ensemble assumes that the sampling time is infinity. In practice this
requirement is considerably relaxed, such that a set of 10 dense gas
experiments conducted during similar external conditions is assumed to
comprise an ensemble. Equation (12) can also be used to calculate the
variance "missed" by an instrument because it is turned on for a finite
sampling time $T_s$:

$$\sigma_c^2(0, T_s)/\sigma_c^2(0, \infty) = 1 - 2(T_I/T_s)\left(1 - (T_I/T_s)\left(1 - \exp(-T_s/T_I)\right)\right) \qquad (13)$$

where the first variable inside the parentheses after $\sigma_c^2$ is the
averaging time and the second variable is the sampling time. Any eddies with
time scales much larger than $T_s$ are not detected by the instrument. For
example, if $T_s$ is 10 times the integral scale $T_I$, then only 82 percent of
the total possible variance is seen. If both the sampling time $T_s$ and the
averaging time T are finite (as they are in any experiment) then the fraction
of the total possible variance can be calculated by multiplying Equations (12)
and (13) together.

An example is given in Figure 12 for the special case $T_s/T = 100$ (for example,
averaging time could be one minute and sampling time could be 100 minutes).
This function clearly defines a "window", with high and low frequency
fluctuations filtered out by the finite sampling and averaging times.
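The "window" can be sketched by multiplying the averaging-time factor of
Equation (12) by the sampling-time factor of Equation (13). The code below is
an illustrative sketch only; the function names are ours, and the time values
in the example are arbitrary.

```python
import math

def averaging_factor(T, T_I):
    """Equation (12): fraction of variance remaining after averaging over T."""
    r = T_I / T
    return 2.0 * r * (1.0 - r * (1.0 - math.exp(-T / T_I)))

def sampling_factor(T_s, T_I):
    """Equation (13): fraction of variance seen in a finite sampling time T_s.
    It is one minus Equation (12) evaluated at T_s."""
    return 1.0 - averaging_factor(T_s, T_I)

def window_fraction(T, T_s, T_I):
    """Product of Equations (12) and (13) for finite T and T_s."""
    return averaging_factor(T, T_I) * sampling_factor(T_s, T_I)

# Text example: sampling time equal to 10 integral scales.
seen = sampling_factor(10.0, 1.0)                  # about 0.82

# Special case T_s / T = 100, evaluated for one arbitrary integral scale.
fraction = window_fraction(1.0, 100.0, 10.0)
```

Evaluating `window_fraction` over a range of `T_I` values with `T_s = 100 * T`
traces out a curve of the kind described for Figure 12.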

Figure 12.  Fraction of Total Possible Concentration Fluctuation Variance
from Equations (9) and (10), as a Function of Sampling Time
$T_s$, Averaging Time T and Integral Time Scale $T_I$. It Is Assumed
that $T_s/T = 100$.

D.  MODEL PHYSICS ERRORS IN TYPICAL HAZARD RESPONSE MODELS

The most difficult component of the uncertainty to evaluate is the
component due to model physics errors. This can be calculated as a resultant
from Equation (7), but has a great potential for uncertainty itself because
the stochastic and data input uncertainties are of the same order of magnitude
as the total uncertainty. To determine whether the inclusion of a specific
model physics component (along with the resultant increase in uncertainty that
is associated with its input data requirements) reduces or increases the model
errors, we use the model performance parameter $C_o/C_p$, where $C_o$ is the
observed concentration and $C_p$ is the concentration predicted by some
component or model. The variance of $C_o/C_p$ is calculated and defined as
var(Model i), where Model i could be defined in any manner. Rafferty and
Dumbauld (Reference 78) also looked at the variance in the ratio $C_o/C_p$,
from the perspective of assessing the variability in the source term among
many smoke dispersion trials.


If Model 1 is given by the simple relation $C_p = 1$, then var(Model 1)
equals the variance of the observed concentration $C_o$. Obviously it is desired
that the variance of any other model be less than this amount. If the
new model increases the variance, then it clearly has not demonstrated
any skill. Other models can be quite simple or can be defined by standard
models such as OB/DG or AFTOX:

Model 2    $C_p = aQ$
Model 3    $C_p = bQ/u$
Model 4    $C_p = c/u$
Model 5    $C_p = dQ/(\sigma_w \sigma_v x^2 u)$
Model 6    $C_p$ = OB/DG

where a, b, c, and d are constants.

We use our knowledge of dispersion to formulate the first five models.
If the variance of, say, $C_o/C_p$ for Model 3 is not less than the variance of
$C_o/C_p$ for Model 2, then this addition of the wind speed, u, is not important
for the model predictions compared to the uncertainty it introduces (the
uncertainty in measuring the representative value of the transport speed). An
application of this procedure using the Prairie Grass dispersion observations
is given below.

The data from the Prairie Grass experiments provide us with a set of
tracer concentration and meteorological data for 53 trials. Data from other
Prairie Grass trials are available, but these data are incomplete, and
estimation of the stability class using the Golder (Reference 79) method was
applied only to the set of 53 trials that we are analyzing. The stability
class is important in that wind fluctuations in the vertical were not
measured, so these need to be inferred from the stability class index.


Models for $C_p$ are evaluated first by taking the independent variables
separately. For each model, the variance in the ratio $C_o/C_p$ is computed, and
scaled by the average ratio (squared) so that the variance associated with
each of the models can be compared. That is, the scaled variance is given by:

$$\mathrm{svar}(C_o/C_p) = \frac{\mathrm{var}(C_o/C_p)}{\left(\overline{C_o/C_p}\right)^2}$$

These scaled variances are reported in Table 11.
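The scaled variance is simple to compute for any candidate model. The sketch
below is illustrative only: the function name is ours, the population form of
the variance is an assumption (the text does not state which form was used),
and the sample values are hypothetical rather than Prairie Grass data.

```python
from statistics import mean, pvariance

def scaled_variance(observed, predicted):
    """Variance of the ratio Co/Cp divided by the square of its mean.
    Uses the population variance (an assumption)."""
    ratios = [co / cp for co, cp in zip(observed, predicted)]
    return pvariance(ratios) / mean(ratios) ** 2

# Hypothetical observed concentrations (not from the Prairie Grass trials).
observed = [1.0, 2.0, 3.0]

# "Model 1", Cp = 1: the result is the scaled variance of the observations.
baseline = scaled_variance(observed, [1.0, 1.0, 1.0])

# A model that is proportional to the observations removes all the variance.
perfect = scaled_variance(observed, [2.0, 4.0, 6.0])
```

Because the ratio is scaled by its own mean, any constant multiplier in the
model (such as a, b, c, or d above) has no effect on the result.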

The first model, $C_p = 1$, provides us with the scaled variance in the
observed concentrations, which equals 2.2 in this case. Each of the remaining
models tests the importance of including emissions and plume-growth variables
one at a time. Note that the standard deviation of vertical angle
fluctuations, $\sigma_\phi$, is inferred from the stability class by taking the
leading coefficient of the curve for $\sigma_z$ proposed by Briggs (Reference 72)
for each stability class, and interpolating. Each variable except the emission
rate, Q, and the square of the distance, $x^2$, reduces the scaled variance.
This suggests that a model that uses either Q or $x^2$ may not perform well.


This conjecture is evaluated by formulating several models that combine
the variables. These are listed in Table 12. The first combines the two
variables that exhibited the greatest reduction in the scaled variance, wind
speed, u, and distance from the source, x. Their product produces a
relatively small scaled variance of 0.666. When the wind fluctuation
variables $\sigma_\theta$ and $\sigma_\phi$ are introduced as well, the scaled
variance rises to 0.753. This combination would suggest the following square
root dependencies of $\sigma_y$ and $\sigma_z$: $\sigma_y \propto \sigma_\theta x^{1/2}$
and $\sigma_z \propto \sigma_\phi x^{1/2}$. However, application of this model
in the third line of the table shows no improvement over the simpler model.
The next pair of models are similar to the previous pair, except distance is
introduced as square (linear growth in $\sigma_y$ and $\sigma_z$ with distance).
Now it is evident that even though the $x^2$ model performed poorly in Table 11,
it performs very well when coupled with the turbulence variables, producing a
scaled variance of 0.514. Thus, it appears that $\sigma_y$ and $\sigma_z$ are each
proportional to x. A third model for plume growth appears next. In this
case, $\sigma_y$ and $\sigma_z$ are computed from the stability class index and
the Briggs (Reference 72) curves, where interpolation between the curves is
done for fractional stability class values. This model does not reduce the
variance as much as the linear growth model. Finally, the results of the AFTOX
model for unit emission rate are included, and this model is most successful
in reducing the scaled variance (0.484). Apparently, the additional model
physics incorporated in AFTOX relating to source-receptor geometry and plume
growth rates is important in modeling these Prairie Grass data. In future
research, confidence limits on these scaled variances should be calculated to
determine whether the differences between, say, AFTOX and the
$1/(\sigma_\phi \sigma_\theta x^2 u)$ models are significant.

TABLE 11.  COMPARISON OF SCALED VARIANCES OF $C_o/C_p$ FOR SEVERAL
           SINGLE-VARIABLE MODELS OF $C_p$ (PRAIRIE GRASS DATA).

Model for $C_p$               Scaled Variance*

$C_p = 1$                     2.200
$C_p = Q$                     2.979
$C_p = 1/u$                   1.652
$C_p = 1/\sigma_\theta$       1.875
$C_p = 1/\sigma_\phi$         1.860
$C_p = 1/x$                   1.356
$C_p = 1/x^2$                 4.674

* Scaled Variance = Variance($C_o/C_p$) / $(\overline{C_o/C_p})^2$

The  last  three  models  included  in  Table  12  contain  the  emission  rate, 
and  the  scaled  variances  increase  substantially  as  a  result.  Apparently,  the 
effective  emission  rates  for  these  trials  are  not  well  known,  and  including 
the  emission  rate  in  a  model  for  these  data  actually  degrades  model 
performance. Use of a mean emission rate in AFTOX would produce better
agreement with the observations than the use of the stated emission rate for
each of the trials.

TABLE 12.  COMPARISON OF SCALED VARIANCES OF $C_o/C_p$ FOR SEVERAL MODELS OF
           $C_p$ THAT INVOLVE COMBINATIONS OF MORE THAN ONE VARIABLE
           (PRAIRIE GRASS DATA)

Model for $C_p$                                   Scaled Variance*

$C_p = 1/(x u)$                                   0.666
$C_p = 1/(x u \sigma_\theta \sigma_\phi)$         0.753
$C_p = 1/(x^2 u)$                                 1.804
$C_p = 1/(x^2 u \sigma_\theta \sigma_\phi)$       0.514
$C_p = 1/(u \sigma_y \sigma_z)$                   0.597
$C_p$ = AFTOX/Q                                   0.484
$C_p = Q/(x^2 u \sigma_\theta \sigma_\phi)$       1.030
$C_p = Q/(u \sigma_y \sigma_z)$                   1.103
$C_p$ = AFTOX                                     0.933

* Scaled Variance = Var($C_o/C_p$) / $(\overline{C_o/C_p})^2$

SECTION  IV


FRAMEWORK  OF  MODEL  EVALUATION  PROCEDURE 

A.  OVERVIEW  OF  MODEL  EVALUATION  APPROACH 

A  major  part  of  this  effort  involves  the  development  of  a  framework  for 
the  evaluation  of  currently  available  microcomputer-based  hazard  response 
models.  This  evaluation  is  intended  to  employ  standard  statistical 
procedures,  and  provides  information  only  concerning  the  total  model  error. 

No  information  can  be  obtained  on  the  components  of  the  model  error  discussed 
in  the  previous  section.  Using  the  results  of  this  statistical  evaluation,  it 
is  possible  to  quantify  the  uncertainty  for  various  scenarios.  A  framework 
for  model  evaluation  has  been  developed  that  contains  several  options  at 
certain  key  steps.  These  options  have  been  tested  in  a  preliminary  way  in 
Phase  I,  and  may  be  further  refined  if  Phase  II  is  carried  out.  The  steps  in 
this framework have the following form:

Sigma Research has been working on several model evaluation procedures
that could be used as options in steps 4 and 5 of the framework (References 38
and 68). Other potential procedures have been suggested by Fox (Reference
80), the EPA (Reference 81), and Cox and Tikvart (Reference 82). Specific
model performance measures are listed in Section IV.B. In general the purpose
of this exercise is to answer two questions:

1.  Are the model predictions significantly different from the
observations? 

2.  Are  the  predictions  of  two  models  significantly  different  from 
each  other? 

The  answers  to  these  questions  depend  on  prudent  applications  of  statistical 
procedures  for  calculating  confidence  limits. 

B.  DESCRIPTION  OF  PERFORMANCE  MEASURES 

For  many  purposes  a  straightforward  statistical  analysis  of  model 
performance is necessary. In this section the recommendations of an EPA/AMS
committee (Reference 80) are followed to the extent that a limited number of
performance measures is employed. Accurate calculation of confidence limits
is emphasized. Scientific review of the models is just as important as
statistical evaluations.

1.  Description  of  Statistical  Analysis 

A  scheme  for  evaluating  air  quality  models  has  been  developed  by 
Hanna  and  Heinold  (Reference  83)  which  uses  a  modest  set  of  performance 
measures  and  which  emphasizes  estimation  of  confidence  limits  on  each 
performance  measure.  Two  basic  performance  measures  are  used.  The  first  is 
called  the  fractional  bias,  FB,  and  emphasizes  the  bias  in  the  mean  predicted 
concentrations  (see  Reference  82): 


$$FB = (\overline{C_o} - \overline{C_p})/(0.5(\overline{C_o} + \overline{C_p})) \qquad (14)$$

The second performance measure emphasizes the scatter in the entire data set
and is defined as the normalized mean square error, NMSE (see Reference 83):

$$NMSE = \overline{(C_p - C_o)^2}/(\overline{C_o}\,\overline{C_p}) \qquad (15)$$

The normalization by $\overline{C_o}\,\overline{C_p}$ assures that NMSE will not be
biased towards models that overpredict or underpredict.
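Both performance measures can be computed from paired lists of observed and
predicted concentrations. The following is a minimal sketch of Equations (14)
and (15) as written above; the function names are ours and the sample values
are hypothetical.

```python
def fractional_bias(observed, predicted):
    """Equation (14): difference of the means divided by their average."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    return (mean_obs - mean_pred) / (0.5 * (mean_obs + mean_pred))

def normalized_mean_square_error(observed, predicted):
    """Equation (15): mean square difference divided by the product of means."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    mean_square = sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n
    return mean_square / (mean_obs * mean_pred)

# Hypothetical paired concentrations (not from any field experiment).
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 2.0]

fb = fractional_bias(observed, predicted)
nmse = normalized_mean_square_error(observed, predicted)
```

In this example the two means are equal, so FB is zero even though the
individual predictions scatter about the observations; the scatter shows up
only in NMSE.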

The  data  used  for  model  testing  should  be  independent  of  the  data 
used  for  model  development  and  should  not  be  serially  correlated.  For 
example,  it  would  be  preferred  that  only  1  hour  of  data  from  a  given  afternoon 
be  used,  since  the  data  from  the  other  hours  on  that  afternoon  are  undoubtedly 
correlated  with  each  other.  Unfortunately,  in  the  air  quality  business  data 
sets  are  seldom  large  enough  to  make  this  possible. 

Next  the  averaging  period,  sampling  period,  and  time  and  space 
pairing  of  the  data  set  must  be  chosen.  The  fractional  bias  and  the  NMSE  can 
be calculated for all data paired in time and space (i.e., $C_p - C_o$ is used from
each  monitor  and  each  time).  Weil  and  Brower  (Reference  84)  prefer  to  use  the 
maximum  observed  and  predicted  concentrations  at  each  averaging  period  on  a 
given  monitoring  arc,  but  this  method  requires  extensive  monitoring  over  wide 
arcs.  Or,  FB  and  NMSE  can  be  calculated  for  maximum  concentrations  at  given 
monitoring  locations  independent  of  time.  The  combinations  of  data  used  are 
determined  by  the  goals  of  the  study. 

Once  the  bias  and  NMSE  are  calculated  for  a  number  of  models  for  the 
given  data  set,  confidence  limits  should  be  calculated  to  answer  the  following 
questions: 


1.  Is  the  bias  significantly  different  from  zero? 

2.  Is the bias for Model i significantly different from the
bias for Model j?

3.  Is the NMSE for Model i significantly different from the
NMSE for Model j?

Because the distributions of these parameters are not easily
transformed to a normal distribution, standard analytical procedures for
calculating confidence limits may not be accurate. The bootstrap resampling
procedure described by Efron (Reference 85) is used instead. EPA scientists
(Reference 82) have begun to apply the bootstrap procedure to air quality model
studies, and many other examples are given by Hanna and Heinold (Reference
83).


The bootstrap procedure is best explained using a simple
example. Suppose a set of 100 pairs of observed and predicted concentrations
are available, and the averages $\overline{C_o}$ and $\overline{C_p}$ are
calculated. In the bootstrap procedure, 100 new pairs are selected randomly
(with replacement) by computer (or by hand, if you have the time) from the
original set and new values of the averages $\overline{C_o}$ and
$\overline{C_p}$ are calculated. This is done 100 to 1000 times (depending on
the computer costs and availability), giving a histogram or pdf of
$\overline{C_p} - \overline{C_o}$. If all of the resampled values of
$\overline{C_p} - \overline{C_o}$ are, say, greater than zero, then it can be
stated with great confidence that $\overline{C_p} - \overline{C_o}$ is
significantly different from zero. But if the resampled
$\overline{C_p} - \overline{C_o}$ distribution crosses zero at some point
between the 5th and 95th percentiles, then it cannot be stated with 90 percent
confidence that $\overline{C_p} - \overline{C_o}$ is significantly different
from zero. This method of estimating the 5th and 95th percentile has been
called the "seductive bootstrap" by some statisticians, who recommend the
alternate procedure of using the calculated variance to determine these
percentiles. However, we have found that the differences in these methods are
minor for most air quality data.
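The resampling steps described above can be sketched in a few lines. This is
an illustrative sketch of the percentile form of the bootstrap, not the
authors' software: the function names, the fixed random seed, the simple
index-based percentile rule, and the sample data are all our assumptions.

```python
import random

def bootstrap_mean_bias(observed, predicted, n_resamples=1000, seed=0):
    """Return the 5th and 95th percentiles of the resampled mean bias Cp - Co."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    n = len(observed)
    biases = []
    for _ in range(n_resamples):
        # Draw n pair indices with replacement from the original set.
        picks = [rng.randrange(n) for _ in range(n)]
        mean_obs = sum(observed[i] for i in picks) / n
        mean_pred = sum(predicted[i] for i in picks) / n
        biases.append(mean_pred - mean_obs)
    biases.sort()
    # Simple index-based percentiles of the sorted resampled biases.
    low = biases[int(0.05 * (n_resamples - 1))]
    high = biases[int(0.95 * (n_resamples - 1))]
    return low, high

def bias_differs_from_zero(low, high):
    """True when the 5th-to-95th percentile range excludes zero."""
    return low > 0.0 or high < 0.0

# Hypothetical data: 100 pairs in which every prediction is 5 units too high.
observed = [10.0 + (i % 7) for i in range(100)]
predicted = [c + 5.0 for c in observed]
low, high = bootstrap_mean_bias(observed, predicted)
```

The same loop applies to any other statistic: replace the mean bias inside the
loop with FB, NMSE, or a difference between two models evaluated on the same
resampled pairs.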


An example of an application of this procedure is given in
Figure 13, where seven models have been tested with the EPRI power plant data
mentioned above and the distributions of the mean bias are drawn as Tukey
"whisker-plots." It cannot be stated with 90 percent confidence that the mean
bias, $\overline{C_p} - \overline{C_o}$, for models 2 and 5 is significantly
different from zero. The $\overline{C_p} - \overline{C_o}$ for the other models
is significantly different from zero by this criterion.


The bootstrap procedure can be applied to any statistic
calculated from the original data set. While the use of the bias and the NMSE
is emphasized above, the correlation, r, the median, or any other parameter
could just as easily have been used. Even nonstandard statistics such as the
percent of predictions within a factor of two of the observations can be used.
A generalized algorithm using the bootstrap procedure to calculate confidence
intervals on the statistics FB and NMSE has been written for the IBM PC
microcomputer and is available on floppy disk from the authors.

Figure 13.  Illustration of "Whisker-Plots" of Cumulative Distribution
Functions of $C_p - C_o$ for Seven Different Models, as Determined by
Bootstrap Resampling. It is Concluded with Better than 95%
Confidence that the Difference $C_p - C_o$ is not likely to be Zero for
All Models Except Numbers 2 and 5 (i.e., the 5% or 95% Points on
the Whisker Plots do not Overlap Zero).

C.  INTERPRETATION  OF  CONFIDENCE  INTERVALS 

One  of  the  primary  advantages  of  our  model  evaluation  procedure  is  its 
application  of  the  concept  of  confidence  intervals.  Some  analytical  methods 
of calculating confidence limits are well-known. For example, if a "parent
distribution" has a normal or Gaussian distribution with mean of 0.0 and
standard deviation of $\sigma$, and many samples of size n are randomly drawn
from this distribution, then the calculated means of these samples will have
an expected mean of 0.0 and standard deviation $\sigma/\sqrt{n}$. This is called
the central limit theorem.

If a calculated mean of a given sample of size n is M and the standard
deviation is S, and n is greater than about 30, then it can be stated that
there is a 95 percent chance that the mean of the parent distribution from
which the sample was drawn is in the range from $M - 2S/\sqrt{n}$ to
$M + 2S/\sqrt{n}$. This is called the Student t procedure, and is summarized in
many texts including Panofsky and Brier (Reference 86).


In air pollution modeling, it is often asked if the model predictions are
significantly different from the observations. For example, what are the
chances that $\overline{C_p} - \overline{C_o}$ would equal zero for a given set
of pairs of $C_p$ and $C_o$? The analytical methods (e.g., central limit
theorem) described above for estimating confidence limits are based on the
presumption that the distribution of $C_p - C_o$ is normal or Gaussian. If it
is not, the analytical methods begin to fail and should be replaced by a
resampling procedure (such as the bootstrap) that does not care whether the
distribution is Gaussian or not. Section IV.B recommends the "seductive
bootstrap" for resampling air quality data, but the "jackknife" or the
"multihalver" could be used (Reference 87). The output of any of these methods
is an estimate of the mean, M, and the standard deviation, S, from which the
95 percent confidence interval equals (as before) $M - 2S/\sqrt{n}$ to
$M + 2S/\sqrt{n}$ (for n > 30).


For example, suppose $M = \overline{C_p} - \overline{C_o}$ = 10 µg/m³,
S = 40 µg/m³, and n = 100. Then the 95 percent confidence interval on M ranges
from 2 µg/m³ to 18 µg/m³. In this case the confidence interval does not
include zero and the interpretation is that the mean bias
$\overline{C_p} - \overline{C_o}$ is significantly different from zero (with 95
percent confidence). This conclusion would be the opposite if
$M = \overline{C_p} - \overline{C_o}$ were reduced to 5 µg/m³.
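The arithmetic of this example is easily reproduced. The sketch below applies
the approximate interval of plus or minus 2S divided by the square root of n;
the function name is ours.

```python
import math

def mean_confidence_interval(M, S, n):
    """Approximate 95 percent confidence interval on the mean (for n > 30)."""
    half_width = 2.0 * S / math.sqrt(n)
    return M - half_width, M + half_width

# Values from the text, in micrograms per cubic meter.
low, high = mean_confidence_interval(10.0, 40.0, 100)     # 2.0 to 18.0
low_5, high_5 = mean_confidence_interval(5.0, 40.0, 100)  # -3.0 to 13.0

excludes_zero = low > 0.0 or high < 0.0          # bias of 10: significant
excludes_zero_5 = low_5 > 0.0 or high_5 < 0.0    # bias of 5: not significant
```

With the bias reduced to 5 µg/m³ the interval spans zero, which is why the
conclusion reverses.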

The  same  approach  is  applied  to  differences  between  models,  where 
$M = \overline{C_{p1}} - \overline{C_{p2}}$, with subscripts 1 and 2 representing two different models.
Experience  shows  that  significant  differences  between  models  occur  less 
frequently  than  intuition  would  suggest,  as  will  be  seen  in  the  next  section. 


SECTION  V 


PRELIMINARY  APPLICATIONS  OF  MODEL  EVALUATION  PROCEDURES 

The  model  evaluation  procedures  discussed  in  Section  IV  were  applied  to 
predictions  of  several  models  for  several  types  of  field  experiments.  Four 
specific  applications  are  described  in  this  section. 

A.  DENSE GAS MODEL COMPARISONS

Alp et al. (Reference 88) presented Table 13, which contains predictions
of four dense gas models for six runs during the Maplin Sands LPG experiments.
We did not include the FEM3 model in our comparisons, since it did not make
predictions for two of the runs. Table 14 contains the results of the model
evaluation, showing that the fractional bias FB and the normalized mean square
error NMSE are quite small for all models. In fact, FB is not significantly
different from zero (95 percent c.i.) for any of the models. However, FB for
the DEGADIS model is significantly different from FB for the other two models.
The three NMSE values are not significantly different from each other. It is
concluded that all three models perform well and that their performance is not
significantly different. The broad confidence limits are a result of the
relatively small number of data that were used (n = 6).

TABLE 14.  CONCLUSIONS FROM PROPANE RUNS REPORTED BY ALP (REFERENCE 75)

MODELS:    COBRA III    HEGADAS II    DEGADIS

FB         0.002        0.09          0.151
NMSE       0.09         0.09          0.10

MODELS FOR WHICH FB IS NOT SIGNIFICANTLY DIFFERENT FROM 0 (AT 95 PERCENT
CONFIDENCE LEVEL):  ALL

MODELS FOR WHICH DFB IS NOT SIGNIFICANTLY DIFFERENT FROM 0 (AT 95 PERCENT
CONFIDENCE LEVEL):  COBRA III - HEGADAS II.

MODELS FOR WHICH DNMSE IS NOT SIGNIFICANTLY DIFFERENT FROM 0 (AT
95 PERCENT CONFIDENCE LEVEL):  ALL

Note  FB = $(\overline{C_o} - \overline{C_p})/0.5(\overline{C_o} + \overline{C_p})$
      NMSE = $\overline{(C_p - C_o)^2}/(\overline{C_o}\,\overline{C_p})$

Lower Flammability Limit = 2.1%

As another example, Figure 14 shows predictions of four models for the
Thorney Island Trial 14 freon experiment (from Reference 49). One model
(Eidsvik) appears to be greatly in error, and the other models have much less
error. Concentrations at five times after release were considered. The
output of the statistical software is given in Table 15. The Eidsvik model
yields a fractional bias FB that is significantly different from zero (95
percent c.i.) and is different from the FB of the other models. The NMSE for
the four models are not significantly different. This apparent lack of
difference in NMSE among models is partly due to the small sample size (n = 5
in this case). The smaller n, the larger the confidence interval, as
described in Section IV.C. Generally about 100 separate field trials are
needed to show significant differences among similar models. It can be
concluded from this application that the performance of all four models is
similar, with the possible exception of the Eidsvik model as an outlier.


TABLE 15.  CONCLUSIONS FROM THORNEY ISLAND TRIAL 14 DATA.

MODELS:    COX      EIDSVIK    HEGADAS    MARIAH

FB         -0.27    -1.36      0.17       -0.29
NMSE       0.13     0.28       0.34       0.13

MODELS FOR WHICH FB IS NOT SIGNIFICANTLY DIFFERENT FROM 0 (AT 95 PERCENT
CONFIDENCE LEVEL):  ALL BUT EIDSVIK.

MODELS FOR WHICH DFB IS NOT SIGNIFICANTLY DIFFERENT FROM 0 (AT 95 PERCENT
CONFIDENCE LEVEL):  COX-HEGADAS, COX-MARIAH, HEGADAS-MARIAH.

MODELS FOR WHICH DNMSE IS NOT SIGNIFICANTLY DIFFERENT FROM 0 (AT 95 PERCENT
CONFIDENCE LEVEL):  ALL.

Note  FB = $(\overline{C_o} - \overline{C_p})/0.5(\overline{C_o} + \overline{C_p})$
      NMSE = $\overline{(C_p - C_o)^2}/(\overline{C_o}\,\overline{C_p})$

B.  AFTOX OB/DG COMPARISONS


The new AFTOX model (Reference 2) is intended to replace the 20-year old
OB/DG  model,  which  has  been  used  by  the  U. S.  Air  Force  to  calculate 
concentrations  and  hazard  corridors  for  many  years.  It  is  described  in  more 
detail  in  Section  II.  But  is  the  AFTOX  model  a  significant  improvement  over 
the  OB/DG  model?  Kunkel  (Reference  2)  tested  both  models  using  the  Prairie 
Grass,  Ocean  Breeze,  Dry  Gulch  and  Green  Glow  data,  and  kindly  supplied  us 
with  meteorological  data  and  model  predictions  for  each  experiment.  A  summary 
of  the  tests  is  given  in  Table  16. 

TABLE 16.  SUMMARY OF INERT TRACER TESTS.

Test                           Trials    Arc Distances                     Conditions

Ocean Breeze, Cape Canaveral   76 FP     1.2, 2.4, 4.8 km                  Daytime
Dry Gulch, Vandenberg          109 FP    Course 1: 2, 3, 5.7 km            Daytime
                                         Course 2: 0.85, 1.5, 4.7 km
Green Glow, Hanford            66 FP     0.2, 0.8, 1.6, 3.2,               Stable
                                         12.8, 25.6 km
Prairie Grass, Nebraska        70 SO2    0.05, 0.1, 0.2, 0.4, 0.8 km       Daytime

This  evaluation  is  hampered  by  the  fact  that  the  OB/DG  model  is  in  fact 
an  empirical  model  derived  from  the  Ocean  Breeze,  Dry  Gulch,  and  Prairie  Grass 
data.  The  model  developers  (Reference  27)  did  separate  the  data  into  a 
developmental  data  set  used  for  deriving  the  OB/DG  equation  and  a  test  data 
set used for its evaluation, but the test data are not independent in a true
statistical sense. The Green Glow data are independent, but are weighted
towards stable conditions, for which the OB/DG model is not calibrated.

Our model evaluation results are summarized in Table 17, where the number
in the experiment code indicates a given monitoring arc, and n equals the
number of data. There is not a clear distinction between the models, with the
AFTOX model performing better on about as many data sets as the OB/DG model.
It is difficult to decide how to combine all the various results in the table.
In fact, if one counts only those data sets where one model showed a
significantly better FB or NMSE, then the "score" is AFTOX 11 to OB/DG 13. It
is concluded that the AFTOX and OB/DG models are not significantly different
when compared to these data sets.

TABLE 17.  STATISTICAL EVALUATION FOR PRAIRIE GRASS (PG), GREEN GLOW (GG),
           OCEAN BREEZE (OB), AND DRY GULCH (DG) EXPERIMENTS. THE NUMBER
           FOLLOWING THE SITE CODE IS THE MONITORING ARC.

                FB       FB       NMSE    NMSE    DOES        DOES
EXPER     n     AFTOX    OBDG     AFTOX   OBDG    DFB=0 **    DNMSE=0 **

pg1       52    -0.27    -0.34    0.49    0.49    yes  A      yes
pg2       52    -0.03*   -0.11    0.005   0.098   yes  A      yes  A
pg3       52     0.087   -0.013   0.062   0.066   no   O      yes  A
pg4       52     0.031   -0.01*   0.14    0.092   yes  O      yes  O
pg5       52    -0.16*   -0.11*   0.48    0.42    yes  O      yes  O
gg1       24     0.06*   -0.64    0.13    0.83    no   A      no   A
gg2       24     0.15*   -0.45    0.17    0.54    no   A      no   A
gg3       24    -0.13*   -0.5     0.39    0.56    no   A      yes  A
gg4       24    -0.4     -0.5     0.68    0.57    yes  A      yes  A
gg5       24    -0.8     -0.33    4.1     0.68    no   O      yes  O
gg6       24    -0.91    -0.1*    5.5     0.5     no   O      no   O
ggall     144   -0.44    -0.44    2.74    0.63    no          no   O
ob1       67    -0.1     -0.01    0.39    0.3     yes  O      yes  O
ob2       67    -0.37    -0.09*   0.9     0.63    no   O      no   O
oball     134   -0.24    -0.05    0.68    0.47    yes  O      no   O
dgb1      45    -0.07    -0.11    0.31    0.2     yes  A      yes  O
dgb2      45     0.2*     0.3     0.43    0.23    yes  A      yes  O
dgball    90     0.058    0.077   0.36    0.21    yes  A      yes  O
dgd1      51    -0.08*   -0.33    0.22    0.3     no   A      yes  A
dgd2      51    -0.26    -0.36    0.36    0.3     yes  A      yes  O
dgdall    102   -0.18    -0.35    0.3     0.3     yes  A      yes

*  Indicates that FB is not significantly different from 0.0 (95 percent c.i.)
** Indicates which model had a lower FB or NMSE (A = AFTOX and O = OBDG)

The  OB/DG  model  is  known  (Reference  30)  to  underpredict  (negative  FB)  for 
stable  conditions,  and  the  AFTOX  model  was  developed  with  the  intent  of 
correcting  this  deficiency.  However,  looking  at  the  results  from  the  six 
Green  Glow  monitoring  arcs,  it  is  seen  that  an  improvement  has  been  made  only 
at  the  inner  three  monitoring  arcs.  By  the  fourth  arc,  the  models  each 
underpredict  by  about  50  percent,  and  by  the  sixth  arc,  the  AFTOX  model 
underpredicts  by  90  percent  and  the  OB/DG  model  is  fairly  close 
(underprediction  of  only  10  percent).  Kunkel  (Reference  2)  also  mentions 
these  problems.  Further  enhancements  of  the  AFTOX  model  are  needed  for  these 
large  downwind  distances. 

C.  CHARM  APPLIED  TO  FLAT  TERRAIN  EXPERIMENT 

As  part  of  the  model  evaluation  procedure,  the  Air  Force  toxic  model, 
AFTOX,  and  the  CHARM  heavy  gas  diffusion  model,  were  tested  using  the  Ocean 
Breeze passive tracer tests. CHARM was run "as is" (a necessity since the
code is proprietary). All data were input manually. Since the Ocean Breeze
study  used  zinc  sulfide  as  the  tracer  and  CHARM  did  not  model  this  substance, 
ethane  was  chosen  from  the  chemical  list  because  it  is  fairly  nonreactive  and 
close  to  the  molecular  weight  of  air.  Model  inputs  were  as  close  to  the 
actual  conditions  as  possible.  It  was  thought  that  a  good  test  of  CHARM’s 
ability  to  model  passive  releases  would  result  from  using  ethane  with  no 
appreciable  release  speed.  The  model  was  allowed  to  choose  the  stability 
class. CHARM chose Pasquill-Gifford class C; class D was reported in Ocean
Breeze.

The  "user  supplies  description"  option  was  used  to  characterize  the 
release.  The  tracer  storage  pressure  was  set  slightly  higher  than  the  ambient 
pressure  to  supply  the  needed  force  for  the  release  to  occur.  The  release  was 
continuous in nature with a constant emission rate. The release was of 30
minutes duration and concentration values were output for the downwind
distances  of  interest.  Comparisons  of  CHARM  and  AFTOX  predictions  with  the 
measurements  for  Runs  2  and  4  of  Ocean  Breeze  show  that  AFTOX  is  much  closer 
to  the  measurements  than  CHARM  but  that  both  models  overpredict.  CHARM  is 
seen  to  overpredict  by  a  factor  of  two  to  four  (see  Table  18).  Confidence 
limits  were  not  calculated  since  the  number  of  data  is  so  small. 

TABLE 18. COMPARISON OF AFTOX AND CHARM USING DATA FROM PROJECT OCEAN BREEZE

RUN   DISTANCE   MEASURED   AFTOX     CHARM
 #      (m)      (mg/m3)    (mg/m3)   (mg/m3)

 2      1207     0.0095     0.01      0.043
 2      2414     0.023      0.037     0.064
 4      1207     0.011      0.013     0.043
 4      2414     0.0031     0.0048    0.0064
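
The overprediction factors quoted for Table 18 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the values are transcribed from Table 18, and the names `TABLE_18` and `overprediction_ratios` are introduced here and are not part of any of the models discussed.

```python
# Values transcribed from Table 18 (mg/m3); each row is
# (run, distance_m, measured, aftox, charm).
TABLE_18 = [
    (2, 1207, 0.0095, 0.010, 0.043),
    (2, 2414, 0.023, 0.037, 0.064),
    (4, 1207, 0.011, 0.013, 0.043),
    (4, 2414, 0.0031, 0.0048, 0.0064),
]


def overprediction_ratios(rows):
    """Return (run, distance, aftox/measured, charm/measured) for each row."""
    return [
        (run, dist, aftox / obs, charm / obs)
        for run, dist, obs, aftox, charm in rows
    ]


ratios = overprediction_ratios(TABLE_18)
for run, dist, r_aftox, r_charm in ratios:
    print(f"Run {run}, {dist} m: AFTOX x{r_aftox:.2f}, CHARM x{r_charm:.2f}")
```

A ratio greater than one indicates overprediction. With only four points, these ratios are descriptive and say nothing about statistical significance.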

D.  AFTOX,  SAFER,  AND  TRACE  MODELS  APPLIED  TO  THORNEY  ISLAND  TRIAL  7  DATA 

Data  from  the  Thorney  Island  dense  gas  experiments  were  also  used  in  the 
model  evaluation  tests.  In  our  application  of  the  AFTOX  model,  Thorney  Island 
Trial 7 was modeled as an instantaneous release of Freon-12, which was
included  in  the  AFTOX  chemical  list.  Output  option  3  was  chosen  and  used  to 
determine  the  maximum  concentration  and  the  distance  downwind  at  which  it 
occurred  at  various  times.  Comparisons  of  AFTOX  predictions  with  those  of 
SAFER and TRACE are shown in Table 19. All of the models overpredict: SAFER
by a factor of 2 for both distances shown, TRACE by a factor of 3, and AFTOX
by a factor of 5. The AFTOX model (which treats the tracer cloud as
nonbuoyant) overpredicts more than the SAFER and TRACE models (which account
for dense gas effects) because it assumes that the puff is always transported
at the wind speed. In contrast, the SAFER and TRACE models assume that the
puff is only slowly accelerated up to the full wind speed. Even though the
puff volume increases more rapidly with time in the AFTOX model, the AFTOX
puff reaches a given downwind distance much faster than the SAFER or TRACE
puffs, and so has had less time to dilute when it arrives.


TABLE 19. MODEL COMPARISON USING DATA FROM THORNEY ISLAND TRIAL 7.

DISTANCE   MEASURED   AFTOX    SAFER    TRACE
  (m)       (ppm)     (ppm)    (ppm)    (ppm)

  240        7000     33000    12000    19000
  407        3800     11508     6900     7600


SECTION VI


CONCLUSIONS  AND  RECOMMENDATIONS 


This  preliminary  study  has  touched  upon  a  large  number  of  issues  related 
to  the  estimation  of  hazard  response  model  uncertainty.  The  results  of  this 
research  are  summarized  below  and  recommendations  for  further  research  are 
given. 

A.  CONCLUSIONS  FROM  PRELIMINARY  APPLICATION 

The  intent  of  this  project  was  to  develop  and  test  a  quantitative  method 
for  assessing  the  uncertainty  of  hazard  response  models.  At  the  beginning  of 
the  project  it  was  not  obvious  that  it  would  be  possible  to  develop  a 
generalized  framework  for  this  purpose.  However,  the  different  components 
have  pulled  together  reasonably  well  and  there  is  hope  for  a  satisfactory 
completion  of  Phase  II.  The  specific  conclusions  listed  below  follow  the 
outline  of  this  report. 

1.  Literature  Review;  Acquisition  of  Models,  Data  Sets,  and  Model 
Predictions. 

The  available  literature  on  hazard  response  models,  evaluations, 
and field studies was reviewed in order to develop an awareness of previous
work  on  this  subject.  Previous  work  on  hazard  response  model  evaluations  had 
been  limited  to  simple  comparisons  and  did  not  account  for  confidence  limits 
on  the  performance  measures  calculated.  However,  a  more  complete  framework 
was  available  from  other  reports  on  air  quality  model  evaluations  for  the  EPA 
and  others.  Consequently,  procedures  do  exist  for  accounting  for  confidence 
limits,  stochastic  fluctuations,  data  input  uncertainties,  and  model  physics 
errors. 


Air Force hazard-response model evaluation reports were
reviewed, including the older OB/DG model reports, the newer AFTOX model
development, the N2O4 experiments, the dense gas model development by Raj, the
data review by Ermak, the model sensitivity studies by Carney, the revisions
of DEGADIS by Spicer and Havens, the applications of CHARM, and the model
comparisons by Key and his coworkers.

Hazard response models were collected, installed on our
microcomputers, and tested. These models include OB/DG, AFTOX, CHARM, OME,
INPUFF, AVACTI-II, MADICT, RVD, D2PC, SPILLS and SLAB.

Sets  of  predictions  by  various  models  for  various  data  sets  were 
collected  and  archived.  These  include  the  OB/DG  and  AFTOX  predictions  for  the 
Prairie  Grass,  Green  Glow,  Ocean  Breeze  and  Dry  Gulch  experiments,  and  several 
dense  gas  model  (for  example,  SAFER,  CHARM,  DEGADIS)  predictions  for  data  sets 
such as Thorney Island and Maplin Sands.

Comprehensive  data  sets  for  several  experiments  were  obtained 
and  archived  on  our  microcomputers.  These  data  sets  include  source  terms  and 
observed  meteorological  parameters  for  each  time  period  during  the  experiment. 

2.  Analysis  of  Components  of  Model  Uncertainty 

A method for decomposing the total model uncertainty or mean
square error into its three components (stochastic fluctuations, data input
errors, and model physics errors) was outlined.
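
Under the simplifying assumption that the three error sources are statistically independent, the decomposition can be written as a sum of variances. The symbols below are introduced here for illustration and are not necessarily the notation used in the body of the report:

```latex
\sigma_{T}^{2} = \sigma_{S}^{2} + \sigma_{D}^{2} + \sigma_{M}^{2}
\quad\Longrightarrow\quad
\sigma_{M}^{2} = \sigma_{T}^{2} - \left(\sigma_{S}^{2} + \sigma_{D}^{2}\right)
```

Here the subscript T denotes the total mean square error, S the stochastic fluctuations, D the data input errors, and M the model physics errors. Written this way, the model physics component is obtained as a residual once the other two components have been estimated.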

Stochastic concentration fluctuations are seen to be the result
of stochastic turbulent fluctuations in the atmosphere. The variance of these
fluctuations can be predicted using analytical methods available from the
literature. The ratio of the standard deviation due to the stochastic
fluctuations to the mean of the observed concentrations ranges from about 0.1
to 1.0 for observations with short-term averaging (on the order of one
second). A formula for expressing this ratio as a function of averaging time
is proposed and some examples given of its application.

Data input errors are significant for standard Air Force
meteorological instrumentation. Typical instrument errors are about 10
percent at a minimum. Even in research-grade experiments these instrument
errors are typically 5 to 10 percent. These errors can be accounted for in
analytical procedures for assessing model sensitivity to input variability.

Model physics errors are difficult to estimate, since they equal
the difference between the total model error and the sum of the stochastic and
the data input error components. A method was developed for calculating the
contribution of various model components to the uncertainty. This method
involves the calculation of the total variance of the model residuals for
various combinations of model components (for example, the combinations
tested included Q, 1/u, Q/u, Q/ux, and further combinations that add the
dispersion parameters σ). It is found that for the Prairie Grass field data
sets, a combination of the σ terms, x, and u accounts for most of the
variance, and additions of further 'improvements' in model physics do not
reduce the total variance any more.

3.  Framework  of  Model  Evaluation  Procedure 

A  quantitative  model  evaluation  procedure  was  derived  and  a 
software  package  was  written  and  tested.  This  procedure  assumes  that  a  table 
exists  that  contains  a  listing  of  observations  of  concentrations  or  hazard 
corridor  lengths,  along  with  predictions  from  one  or  more  models  of  the  same 
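quantity. The fundamental model performance measures that are suggested are
the fractional bias $FB = (\overline{C_p} - \overline{C_o}) / [0.5\,(\overline{C_o} + \overline{C_p})]$
and the normalized mean square error
$NMSE = \overline{(C_o - C_p)^2} / (\overline{C_o}\,\overline{C_p})$,
where $C_o$ is an observed value, $C_p$ is a predicted value, and the overbar
denotes an average over the data set.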

Confidence limits are calculated for FB to determine whether it
is significantly different from zero (that is, are the predictions of the
model significantly different from the observations?). The bootstrap
resampling procedure is used to calculate the confidence limits. In addition,
confidence limits on differences in FB and NMSE between models are calculated
to determine whether the predictions of the models are significantly
different.
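
The procedure can be sketched in a few lines of Python. This is a minimal illustration and not the software package written for this project: the function names, the percentile form of the bootstrap, the number of resamples, and the example data are all assumptions made here. FB is taken as predicted minus observed, so that underprediction gives a negative FB, as in the discussion of the OB/DG results.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def fractional_bias(obs, pred):
    """FB = (mean(pred) - mean(obs)) / (0.5 * (mean(obs) + mean(pred)))."""
    mo, mp = _mean(obs), _mean(pred)
    return (mp - mo) / (0.5 * (mo + mp))


def nmse(obs, pred):
    """NMSE = mean((obs - pred)^2) / (mean(obs) * mean(pred))."""
    mse = _mean([(o - p) ** 2 for o, p in zip(obs, pred)])
    return mse / (_mean(obs) * _mean(pred))


def bootstrap_limits(obs, pred, stat, n_boot=1000, alpha=0.05, seed=0):
    """Percentile limits from resampling (obs, pred) pairs with replacement."""
    rng = random.Random(seed)
    n = len(obs)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(stat([obs[i] for i in idx], [pred[i] for i in idx]))
    values.sort()
    lower = values[int(alpha / 2 * n_boot)]
    upper = values[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical data (not from the report): every prediction is too low.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
predicted = [o * f for o, f in zip(observed, [0.4, 0.6] * 5)]

fb = fractional_bias(observed, predicted)
fb_lower, fb_upper = bootstrap_limits(observed, predicted, fractional_bias)
print(f"FB = {fb:.3f}, 95 percent limits = ({fb_lower:.3f}, {fb_upper:.3f})")
print("FB differs significantly from zero:", not (fb_lower <= 0.0 <= fb_upper))
```

The same resampling function can be applied to the difference in FB or NMSE between two models by passing a statistic that evaluates both models on the same resampled cases.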


4.  Results  of  Preliminary  Application  of  Model  Evaluation  Procedure 

The  model  evaluation  procedure  described  above  was  applied  to 
four sets of model predictions and observations:

a. Dense gas model comparisons.

A limited set of data was available from the Maplin
Sands and the Thorney Island dense gas dispersion experiments. Model
predictions were also available. Because the data sets are so small (n = 4 or
5), the confidence limits on FB and NMSE are so large that there is generally
no significant difference among the models, although visual inspection of the
predictions would suggest that one model appears to perform better than the
others.


b.  AFTOX  and  OB/DG  model  comparisons. 

These models are intended for application to nonbuoyant
sources, and predictions were available for the Prairie Grass, Green
Glow,  Ocean  Breeze,  and  Dry  Gulch  data  sets.  In  each  case  the  sample  size  is 
about  50  to  100.  The  comparison  is  hampered  by  the  fact  that  the  OB/DG  model 
was  actually  derived  from  some  of  these  data  sets.  In  most  cases,  the  two 
model  predictions  are  significantly  different  from  each  other,  although  there 
is  not  a  clear  trend  towards  one  model  or  another.  The  OB/DG  model 
underpredicted  the  stable  Green  Glow  data,  but  the  AFTOX  model  shows  an 
improvement  over  the  OB/DG  model  for  the  Green  Glow  data  set  only  at 
monitoring  arcs  close  to  the  source.  At  the  farthest  monitoring  arcs  the 
OB/DG  model  predictions  are  significantly  better  than  the  AFTOX  model 
predictions. 


c.  CHARM  model  applied  to  non-buoyant  source. 

The CHARM model, which accounts for a wide variety of
source types (including dense gases), was applied to the Ocean Breeze
experiment, where the source was nonbuoyant. It is found that the CHARM
model is conservative (it overpredicts) by a factor of about two to four.
These predictions are significantly different from those of the AFTOX or
OB/DG models.

d.  AFTOX  model  applied  to  dense  gas  experiment. 

The AFTOX model does not apply to dense gas sources.
However,  it  was  applied  to  a  Thorney  Island  dense  gas  run  to  determine  the 
typical error that should be expected. It is found that the model
overpredicts by a factor of four or five. Apparently the overprediction by
the  AFTOX  model  is  due  to  the  fact  that  it  allows  the  puff  to  immediately 
assume  the  wind  speed  and  thus  be  transported  more  quickly  to  the  monitor 
locations. 

B.  RECOMMENDATIONS  FOR  FURTHER  STUDY. 

The  preliminary  conclusions  listed  above  suggest  that  a  more  extensive 
research  program  would  be  worthwhile.  It  is  recommended  that  this  research 
program  contain  the  following  components: 

1.  Archiving  of  Information 

Much  information  on  data  sets  and  model  predictions  was  acquired 
under  Phase  I  of  this  project.  A  comprehensive  search  for  further  data  sets 
and model predictions should be undertaken and the data sets installed on our
microcomputers. This would include all hazardous gas data summarized by Ermak
(Reference 40) and recent field tests using HF and other toxic gases. In
addition, sets of model predictions would be acquired from model developers
for as many of these experiments as possible. Similar data would be obtained
for nonbuoyant experiments (for example, Cabauw) and model predictions.

To  complete  the  tables  of  model  predictions  and  observations, 
models  will  be  run  for  the  missing  periods.  Several  of  these  models  are 
on hand. It is hoped that the following models can be acquired: Raj’s dense
gas model developed for the Air Force (ADAM), the PC version of DEGADIS
developed for the API, a PC version of SLAB developed for the Air Force and
for the API, and a PC version of CAMEO. Also it is possible that some models
that are currently proprietary will be released to the public (for example,
SAFER, EAHAP).

2.  Determination  of  Uncertainty  Components. 

A  start  was  made  in  determining  the  components  of  model 
uncertainty. In the future, improved estimates of typical concentration
fluctuations  for  hazardous  gases  can  be  made  using  data  from  field  experiments 
and  using  theoretical  concepts  suggested  by,  for  example,  Chatwin  (Reference 
74). A better estimate of the integral time and distance scales must be made
in order to calculate the effects of averaging times and distances. We plan to
use analytical formulas developed by Hanna (Reference 89) under an Army
Research Office contract.


The Air Force does not have a good idea of the data
uncertainties in their meteorological instruments. It is desirable to obtain
better estimates of this uncertainty through a field program in which (1) two
similar instruments are set up at the same location; (2) a high-quality
baseline instrument is set up next to the Air Force instrument; (3) several
similar instruments are set up along a path at separations of 10 meters, 20
meters, 50 meters, 100 meters, and so on; (4) wind tunnel and environmental
chamber tests of instruments are made; and (5) manufacturer QA/QC procedures
are carefully reviewed. The impact of individual input data errors on total
model uncertainty can be investigated using analytical methods suggested and
tested by Carney (Reference 5).

Further  application  of  the  variance  reduction  analysis  should 
take place using the comprehensive data sets and model prediction tables. In
this way it can be determined whether a given model (for example, AFTOX or
SAFER) yields any improvement over simple models such as C = Q/u. The
expected improvement in the model due to the inclusion of new model physics
parameters can therefore be estimated.
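
The idea behind the variance reduction analysis can be sketched as follows. This is an illustration under stated assumptions, not the procedure coded for this project: the residual is taken here as the logarithm of the observed-to-predicted ratio (so that a constant scale factor drops out), and the emission rates, wind speeds, and scatter are made-up numbers rather than field data.

```python
import math
import statistics


def log_residual_variance(observed, predicted):
    """Variance of ln(observed/predicted) over all trials."""
    residuals = [math.log(o / p) for o, p in zip(observed, predicted)]
    return statistics.pvariance(residuals)


# Hypothetical trials (not report data): emission rate Q (g/s), wind speed u (m/s).
Q = [50.0, 100.0, 100.0, 60.0, 80.0, 40.0]
u = [2.0, 5.0, 8.0, 3.0, 6.0, 1.5]
scatter = [1.1, 0.9, 1.05, 0.95, 1.2, 0.85]  # made-up multiplicative scatter
observed = [0.01 * q / w * s for q, w, s in zip(Q, u, scatter)]

# Candidate "models" built from successively more components.
candidates = {
    "Q": Q,
    "1/u": [1.0 / w for w in u],
    "Q/u": [q / w for q, w in zip(Q, u)],
}

variances = {
    name: log_residual_variance(observed, pred)
    for name, pred in candidates.items()
}
for name, var in variances.items():
    print(f"{name:>4}: residual variance = {var:.3f}")
```

A component is worth adding to a model only if it lowers the residual variance appreciably. In this made-up example the combination Q/u leaves only the imposed scatter, so no further component could reduce the variance.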

3.  Development  of  Model  Evaluation  Procedures. 

The  model  evaluation  procedures  described  above  are  reasonable 
and  can  produce  estimates  of  confidence  limits.  Future  studies  should  test 
further  candidate  model  performance  measures,  such  as  the  correlation 
coefficient  and  the  fraction  of  predictions  within  a  factor  of  two  of  the 
observations.  The  recent  review  by  Ermak  and  Merry  (Reference  6)  will  also  be 
useful in devising more reasonable performance measures. In addition,
alternate resampling schemes should be tested, such as the jackknife and the
multihalver (Reference 87). The accuracy of these schemes can be tested using
concocted  data  sets  from  a  known  parent  distribution  (for  example,  the  normal 
or  log-normal  distribution). 
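
As a sketch of such a test, the example below draws a concocted sample from a log-normal parent distribution and applies a leave-one-out jackknife to the sample mean, for which the exact answer is known (the jackknife standard error of the mean equals the classical value s/sqrt(n)). The sample size, distribution parameters, and function name are assumptions made for illustration.

```python
import math
import random
import statistics


def jackknife_se(data, stat):
    """Leave-one-out jackknife estimate of the standard error of stat(data)."""
    n = len(data)
    leave_one_out = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out))


# Concocted data set from a known parent distribution (log-normal).
rng = random.Random(42)
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(50)]

jk_se = jackknife_se(sample, statistics.mean)
classical_se = statistics.stdev(sample) / math.sqrt(len(sample))
print(f"jackknife SE = {jk_se:.4f}, classical SE = {classical_se:.4f}")
```

The same function can be passed other statistics (for example, a fractional bias computed on paired data) once agreement on a case with a known answer has been confirmed.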

4. Application of Procedure.


A  few  applications  of  the  preliminary  model  evaluation  software 
have  been  made  and  suggest  that  it  is  possible  to  arrive  at  conclusions 
regarding  the  accuracy  of  a  given  model  or  whether  two  model  predictions  are 
significantly  different  from  each  other.  In  the  future  the  revised  model 
evaluation  procedures  should  be  tested  with  the  comprehensive  data  sets  to  be 
acquired. Models of interest to the Air Force (ADAM, AFTOX, CHARM, DEGADIS,
SLAB, and OB/DG) will be included in the analysis. After this application is
complete,  we  will  have  a  good  idea  of  the  typical  accuracy  of  hazard  response 
models  and  whether  any  one  model  is  better  than  another.  We  note  that  it  is 
important  that  these  decisions  should  be  made  using  independent  data  sets 
(that  is,  data  sets  that  were  not  used  in  the  development  of  the  model). 

5.  Software  Package. 

The  final  product  should  be  a  user-friendly  software  package 
that  can  easily  be  used  to  assess  the  uncertainty  of  hazard  response  models. 
This  package  will  be  easily  transportable  from  one  microcomputer  to  another 
and  can  be  understood  and  operated  by  Air  Force  scientists  and  engineers. 

6.  Confidence  Levels  of  Models. 

Once  experience  is  gained  with  the  model  evaluation  methodology, 
it will be possible to provide decisionmakers with models that include
confidence intervals as well as basic calculations.


APPENDIX A - DETAILED DISCUSSION OF FIELD EXPERIMENTS


A.  PROJECT  PRAIRIE  GRASS 

1.  Overview 

Project Prairie Grass was held in north-central Nebraska near O’Neill
in the summer of 1956 (Reference 60). Personnel from the Massachusetts
Institute of Technology, the Texas A. and M. Research Foundation, the
University of Washington, the University of Wisconsin, the Air Weather
Services, and the Air Force Cambridge Research Center participated in this
series of 70 trials. The project was designed by Air Force personnel at the
Air Force Cambridge Research Center. The primary objective was to determine
the rate of diffusion of the continuously emitted SO2 tracer gas as a function
of meteorological conditions. Secondarily, it was hoped that insight would be
gained into turbulence phenomena. Releases were of 10 minutes duration and
from ground level. Measurements were made at 50, 100, 200, 400, and 800
meters downwind. The trials were made over flat prairie terrain under a
variety of meteorological conditions. Approximately half of the trials were
conducted during unstable daytime periods and the rest were held at night with
temperature inversions present.

2.  Site  Description 

The  experiment  site  was  located  about  5  miles  northwest  of  the  town 
of O’Neill in Holt County, Nebraska (latitude 42 degrees, 29.6 minutes north;
longitude  98  degrees,  34.3  minutes  west).  Approximately  1  square  mile, 
designated  Section  14,  Township  29  North,  Range  11  West,  was  leased  for  the 
duration  of  the  experiment.  The  site  was  virtually  level  and  covered  with 
natural prairie grasses, which were mowed prior to the start of the tests and
grew  very  little  over  the  period.  There  was  an  unobstructed  view  for  miles 
with  no  distinct  horizon  visible  except  to  the  southeast  where  a  small  hill 
was  located.  A-C  power  was  available  from  lines  running  approximately 
perpendicular  to  the  array  centerline  at  a  distance  from  the  release  point  of 
about  1  kilometer.  The  site  was  nearly  completely  obstruction  free.  The 
nearest farmhouse was well over 1300 meters northwest of the release point; no
complaints  were  made  about  the  gas  by  any  nonparticipants.  The  sampling  grid 
was  located  on  semicircular  arcs  at  the  aforementioned  downwind  distances  at 
bearings  between  180  and  270  degrees.  This  array  setup  was  chosen  primarily 
because the wind climatology indicated that the wind was from between 120 and
240 degrees more than 50 percent of the time in July and August. The soil was
composed  of  a  black  top  soil  about  25  centimeters  thick,  a  brown  subsoil  about 
20  centimeters  thick,  and,  beneath  this  brown  subsoil,  a  light  brown  layer  of 
compacted  soil  about  15  centimeters  in  depth.  The  top  soil  contained  about  4 
percent  organic  material.  Both  the  top  soil  and  the  subsoil  had  good 
water-holding  capacity.  Beneath  the  compacted  layer  was  a  loose,  coarse  sand 
of  about  60  centimeters  depth.  Water  held  in  the  sandy  layer  affected  the 
surface  vegetation  only  very  slowly  since  very  few  roots  penetrated  so  far  and 
upward  water  movement  was  quite  slow  through  the  sand  and  compacted  soil. 

3.  Experimental  Design 

The  project  was  designed  with  several  points  in  mind,  notably  the 
improved  understanding  of  general  turbulence  theories,  the  testing  of  specific 
diffusion  theories,  and  an  attempt  to  experimentally  verify  both  past  and 
(then)  present  theories.  These  points  helped  determine  what  types  of 
meteorological  measurements  were  necessary.  It  was  hoped  that  related 
problems  ranging  from  crop  dusting  to  the  forecasting  of  low  level  wind  shear 
would  be  brought  nearer  to  solution  by  the  results  of  the  project.  The  tracer 
technique used was designed by M.I.T. at its Round Hill Field Station. The
technique involved the continuous emission of sulfur dioxide. A continuous
source was chosen for the following reasons: first, continuous sources are
more easily reproduced for projects comprising many trials; second,
interpretation of the concentration data is somewhat easier; and third, the
question of what meteorological data are pertinent is usually simpler. The
duration of the releases was 10 minutes. This was chosen after consideration
of factors such as the cost of the tracer gas, distance between samplers, and
practical rates of emission. It was highly desirable that tracer losses on
vegetation and the ground be negligible, at least in the sampling area.
Another necessity was that the sampler analysis be accurate, cover a wide
range of concentrations, and be done in a timely fashion.

4.  Release  Technique 

A  continuous  point  source  of  sulfur  dioxide  located  near  ground  level 
was used in Project Prairie Grass. SO2 was chosen as the tracer in part
because it was relatively inexpensive and easily available. Liquid SO2 was
vaporized in a specially constructed chamber immersed in hot water in a large
circular tank. The water was maintained at a temperature of 50 degrees
Celsius; it provided the needed thermal energy to ensure that the emission
rate  remained  constant  (the  rate  would  otherwise  drop  off  as  the  pressure  fell 
due  to  the  rapid  cooling  that  occurred  during  the  phase  change).  The  gas  then 
flowed  through  a  pressure  regulator  and  an  adjustable  flow-controller  valve. 
The  pressure  and  temperature  of  the  gas  were  measured  at  the  inlet  and  the 
outlet  to  aid  in  an  accurate  determination  of  the  source  strength.  This 
apparatus  was  partly  buried  in  a  trench  to  reduce  its  effects  on  the  natural 
air  flow.  The  tracer  was  emitted  horizontally  from  a  2  inch  plastic  pipe  at  a 
height  of  46  centimeters.  During  the  daytime  releases,  the  maximum  source 
strength  of  100  grams  per  second  was  used.  The  emission  rate  varied  less  than 
5  percent  in  almost  all  of  the  releases. 

5.  Sampling 

Midget impingers from the Mine Safety Appliance Company were used to
make the measurements. Each impinger contained 10 milliliters of dilute
hydrogen peroxide solution. Air drawn into the impingers via aspiration by
vacuum units was broken into very small bubbles as it impacted upon the bottom
of the glass flasks. SO2 in the air would react with the hydrogen peroxide to
form sulfuric acid. The impingers were mounted on steel fence posts at a
height of 1.5 meters along 5 semicircular arcs. The steel posts were at 2
degree intervals along the 50, 100, 200, and 400 meter downwind arcs and at 1
degree intervals along the 800 meter arc. Average SO2 concentrations were
also determined in the vertical at the 100 meter arc. Lightweight
television-type towers were spaced at 14 degree intervals and instrumented at
9 levels: 0.5, 1, 1.5, 2.5, 4.5, 7.5, 10.5, 13.5, and 17.5 meters. The
analysis of the collected impingers was accomplished by measuring the
electrical conductance of the aspirated solutions using Wheatstone bridges and
dipping conductivity cells. Each impinger was placed in a constant
temperature water bath. When the temperature stabilized at the appropriate
value, the conductivity cell was placed in the impinger and the resistance
indicated on the Wheatstone bridge was recorded. The conductance values were
converted to gas concentrations using well known laboratory techniques. The
uncertainty involved in determining the conductance was less than 2 percent in
the normal range of concentrations.

6.  Meteorological  Data 

Cup  anemometers  and  airfoil  type  vanes  were  used  to  measure  wind 
speed, direction, and fluctuations in wind direction at a height of 2 meters
at 2 locations: one pair of instruments at the release point (one to each
side  of  the  source  and  about  25  meters  away  from  it)  and  another  pair  450 
meters  downwind  about  30  meters  west  of  the  centerline.  The  two  cup 
anemometers  used  were  chosen  from  a  group  of  11  similar  devices  on  the  basis 
of  field-matching  tests  -  an  average  difference  of  only  about  0.25  percent  in 
calibration was found. Wind speed data was recorded on Esterline-Angus chart
recorders.  The  balsa  wood  airfoil  vanes  used  to  measure  wind  direction  and 
direction  fluctuations  became  deformed  due  to  exposure  to  the  wind  and  the 
rain.  These  devices  were  replaced  after  trial  34  with  vanes  made  with  flat 
metal  plates.  The  instrumentation  was  operated  for  20  minute  sampling  periods 
that were centered on the midpoint of the 10-minute gas releases. Mean wind
speed, direction, and standard deviations of direction were calculated both
for the 20 minute sampling periods and the 10-minute release periods. The
wind  speed  data  was  thought  to  be  accurate  to  within  2  to  5  percent  for  mean 
wind  speeds  greater  than  2  meters  per  second.  The  uncertainty  was 
significantly  larger  for  lower  mean  wind  speeds,  such  as  would  occur  at  night 
under  stable  atmospheric  conditions  (the  starting  speed  of  the  cup  anemometers 
was  0.8  meters  per  second).  The  wind  direction  values  may  be  in  error  by  up 
to  10  degrees.  Standard  deviations  of  the  wind  direction  were  considered 
accurate  to  within  10  percent  except  when  the  mean  wind  speed  was  less  than  2 
meters  per  second. 

The  Texas  A  and  M  group  had  a  variety  of  instrumentation  located 
about  825  meters  downwind  of  the  source  and  225  meters  west  of  the  array 
centerline. These instruments included air sampling tubes, thermocouples,
soil temperature elements, cup anemometers and wind vanes, a pyrheliometer to
measure incoming shortwave radiation, and a net radiometer. Wind vanes were
located at heights of 1 and 16 meters, anemometers were sited at heights of
0.25, 0.5, 1, 2, 4, 8, and 16 meters, thermocouples and air samplers were
placed at heights of 0.125, 0.25, 0.5, 1, 2, 4, 8, and 16 meters, and the soil
temperature elements were placed at 0.03125, 0.0625, 0.125, 0.25, 0.5, and 1
meter below the surface. The measurements made by these devices were used to
evaluate the latent and sensible heat fluxes. Rawinsonde flights were made by
the 6th Weather Squadron (Mobile), Tinker Air Force Base, Oklahoma. Flights
were made for all trials except 35 S and 48 S. Computations were made
according to standard Air Weather Service procedures. Pressure, height,
temperature, and relative humidity values were tabulated for the significant
and mandatory levels. Wind values were included for the standard flights.
Aircraft soundings were also taken at the times of the diffusion trials. The
data consisted of height, pressure, temperature, relative humidity, vapor
pressure, and dew point temperature. A standard U.S. Air Force L-20 that had
been instrumented at Hanscom Air Force Base, Bedford, Massachusetts, was used.
Flights  were  made  for  all  trials  except  23,  24,  31,  32,  33,  and  34. 

B.  PROJECT  GREEN  GLOW 

1.  Overview 

The Green Glow series of tests was so named because the zinc sulfide
particles used as a tracer give off a green fluorescence under ultraviolet
light sources (Reference 61). The primary objective of the field study was to
calculate, as functions of the meteorological conditions, the horizontal and
vertical diffusion rates of the particulate tracer. Measurements were to be
made by a dense sampling grid over as great a distance as possible (it was
hoped, out to 16 miles). Green Glow was, in some ways, an extension of Project
Prairie Grass. Measurements were made at the same height above ground level,
1.5 meters, and at 2 of the downwind distances used in Prairie Grass, 200 and
800 meters. Other measurement distances were 1600, 3200, 12800, and 25600
meters. Twenty-six trials were conducted on the Hanford reservation of the
U.S. Atomic Energy Commission near Richland, Washington, in 1959. All trials
were conducted at night over slightly rolling terrain. In addition to
horizontal measurements of the tracer, vertical distributions were also
measured at the first 4 arcs. Green Glow was a joint program designed by
personnel at the Hanford Laboratories of General Electric and the Geophysics
Research Directorate of the Air Force Cambridge Research Laboratories. Site
preparation, equipment procurement and installation, making the measurements,
and reducing the data were the responsibility of the General Electric
personnel. Air Force personnel helped coordinate the efforts of the various
participants and aided in the measurement and data reduction phases.

2.  Site  Description 

The Hanford reservation is located in south-central Washington,
approximately 30 miles east of Yakima and 125 miles southwest of Spokane. The
reservation is surrounded on all sides by elevated terrain, varying from 3500
feet on its southern border with the Rattlesnake Hills to about 1100 feet on
the eastern rim of the basin. There are several major breaches in the basin
sides: the Beverly Gap on the northwest side, the Ringold Coulee on the east
side, the broad valley to the southeast, and, to the south, the Benton City
Gap. These openings channel the air flow into and out of the basin and also
lead to mountain-valley circulations. The topography usually causes a
drainage flow over the central part of the Hanford reservation from the
northwest to the southeast. The sampling grid was located on the valley floor
with the baseline approximately parallel to the major ridges. Vegetation
consisted of desert grasses interspersed with sagebrush 1 to 2 meters in
height. The locale was quite flat, dropping only 300 feet over 16 miles.

3.  Experimental  Design 

Planning was guided by the decision to tie in Project Green Glow with
Project Prairie Grass, by the peculiarities of the Hanford Fluorescent Tracer
System,  by  the  topography  of  the  site,  and  by  the  economics  and  logistics  of 
the  situation.  These  factors  gave  rise  to  the  following  guidelines: 

a.  Sampling  arcs  would  be  concentric  about  the  source. 

b.  Two  arcs  would  be  200  and  800  meters  from  the  source. 

c.  The  release  would  be  from  ground  level. 

d.  The  maximum  release  rate  would  be  8  kilograms  per  hour. 

e. The counting-statistics-based assaying system used would
require a minimum of about 100 particles per sample.

f.  The  range  of  the  sample-assaying  system  was  only  5  orders  of 
magnitude. 

g. To provide the necessary accuracy in arcwise dispersion
estimates, the centerline dosages would have to be at least 100
times the minimum significant count.

h. The tracer would deposit on the surface and/or vegetation at
an unknown rate.

i. Releases would be of 30 minutes duration.

j. Dosage rather than average concentration would be measured.


k.  All  trials  would  be  at  night  under  stable  atmospheric 
conditions. 

l.  Samples  would  be  collected  at  a  height  of  1.5  meters. 

The  major  design  tasks  were  to  specify  the  geometry  of  the  sampling 
network,  the  sampler  spacing,  and  the  rates  of  sampling.  These  parameters  had 
to  acknowledge  the  constraints  imposed  by  the  aforementioned  guidelines.  It 
must  be  noted  that  the  design  remained  flexible  throughout  the  project  so  that 
encountered  problems  could  be  dealt  with  quickly;  errors  or  oversights  were 
caught  by  performing  a  continual  qualitative  check  of  the  data. 

4.  Release  Technique 

Two  standard  Todd  Insecticidal  Fog  Applicators  dispensed  the  tracer. 
These  devices  are  aerosol  fog  generators  that  consist  of  an  air  blower,  a 
combustion  chamber,  a  formulation  pump,  and  a  gasoline  engine.  The 
formulation  used  consisted  of  zinc  sulfide  pigment  mixed  with  sodium  lauryl 
sulfate (a surface active agent) and a small amount of water. Special care
was taken to ensure that the pigment was of a uniform size distribution. Each
fog generator output about 20 grams per hour, and this rate was more or less
constant.


5.  Sampling 

The samplers used in Green Glow consisted of a membrane filter
contained by a disposable polyethylene filter holder. They were bulk
samplers, intended to collect all particles in the intake zone. A new set of
sampling units was used for each trial. The bulk samplers were assayed by
different means, depending on whether or not there was a significant amount of
dust on the filter. If the filters appeared relatively free from dust,
assaying was accomplished by use of a Rankin counter. Rankin counters use a
radioactive isotope (plutonium) to activate the fluorescent pigments on the
filters via alpha bombardment. The scintillations were viewed and counted by
a multiplier phototube. Background counts on the Rankin devices were fairly
low, between 2 and 8 counts per minute. In cases where dust contamination
was present, a Tri-Carb liquid scintillation spectrometer was used. The
samples were treated with a solvent composed of 3 parts ethyl acetate and 1
part ethyl alcohol. As nearly all the airborne particles were insoluble,
including the tracer, a suspension of tracer and contaminants was produced
after agitation. The sample was irradiated by the spectrometer and then the
phosphorescence was counted with a multiplier phototube. To reduce the
background counting level, the phototube system was placed in a deep freeze.

6.  Meteorological  Data 

Meteorological data was collected on the instrumented 410 foot tower,
on a portable 78 foot mast, from the Hanford radio-telemetering network, and
from rawinsonde launches made by personnel from the 6th Weather Squadron
(Mobile), Tinker Air Force Base, Oklahoma. Measurements on the 410 foot tower
were made at heights of 3, 7, 50, 100, 150, 200, 250, 300, and 400 feet. Wind
speed and direction were measured at all levels except at 3 feet, temperature
at all levels except 7 feet, and dew point temperature at all levels except 7,
150, and 250 feet. Aspirated copper thermohms were used to measure
temperature, Foxboro Dew Cells were used to measure dew point temperature, and
Friez Aerovanes were used to measure the wind velocity. All information was
recorded on strip charts. Wind speed and temperature were measured at 2.5, 5,
10, 20, 40, and approximately 80 feet on the portable mast provided by General
Electric. Wind direction was measured at 2.5, 10, 40, and 80 feet using
Beckman and Whitley indicators. Wind speed was measured by 3 cup anemometers
from C.F. Casella and Company, Limited, that had been modified into
photo-chopping devices. Thermocouples were used to measure temperature. All
signals were recorded on a single strip chart recorder using a scanning unit
from Panellit, Inc. Wind speed and direction were measured at the 18 remote
stations comprising the Hanford radio-telemetering network. The central
receiving station was located at the 410 foot tower, near the release point.
Modified Friez Aerovanes placed at a height of 23 feet were used for the
measurements. Standard Air Weather Service equipment was used by the team
from the 6th Weather Squadron to make the rawinsonde observations. Ascents
were made 1 hour in advance of each release time, at the start of each
release, and 1 hour after each release time. All computations were done by
the Weather Squadron personnel. The 1 minute angle values from the wind
recorder were used in preparing the data contained in the project report.

C.  PROJECT OCEAN BREEZE


1.  Overview

Project Ocean Breeze was conducted at Cape Canaveral, Florida, during
1961 and 1962 (Reference 27). Air Force personnel, General Electric technical
staff, and various Air Force contractors participated in the setup and/or
operation of the 76 trials that comprised the project. The trials were
conducted in both the summer and winter seasons; sea breezes occurred
frequently in the summer, and numerous passages of cold fronts in the winter
brought about unstable conditions. The primary objective was to provide the
data needed to develop and test a set of diffusion prediction equations to be
used operationally at the Cape Canaveral missile test range. Air Force
Cambridge Research Laboratory was also determined to develop and install an
automated meteorological data acquisition and processing system on base. This
system would continually output solutions to the developed diffusion
prediction equations. The Hanford Tracer System was used in the project; its
features were factored into the overall project design. Zinc sulfide
particles were used as the tracer. The release point was placed between
launch pads 15 and 16, approximately 1000 feet from the ocean. Measurements
were made at a height of 15 feet at concentric arcs located 0.75, 1.5, and 3.0
miles downwind of the source. The bulk-collecting membrane filters were
placed at 15 feet above ground level because of the vegetation, which was
composed of palmetto 2-5 feet tall and brushwood 7-14 feet tall, situated on
rolling sand dunes 10-20 feet in height.

2.  Site  Description 

The Cape Canaveral missile range is located on the east-central
Florida coast. Its eastern side is bordered by the Atlantic Ocean, and its
western border is the Banana River. The experiment site was characterized by
rolling sand dunes 10-20 feet tall. In addition, much of the diffusion course
was covered with brushwood and palmetto growth. Samplers were placed on the
aforementioned arcs at 2 degree intervals on arcs 1 and 2 between 152
and 340 degrees with respect to the source, and at 1.5 degree intervals on arc
3 between 152 and 236.5 degrees. The orientation of arc 3 limited its use to
occurrences of northerly winds, which occurred fairly often during the winter
months.

3.  Experimental Design


The experiment was designed to be of use in determining the potential
effects of accidental toxic releases on the missile range. As such, the most
probable accident scenarios were considered and found to be adequately
characterized by a continuous point source located at ground level. Due to
the large variability in possible emission periods and to the cost factor,
releases were designed to be of 30 minutes duration. The parameters used to
decide upon the geometry of the sampling grid were climatological data for the
site, terrain features, and the experience of Project Green Glow. Some of the
initial criteria used in the design process were:

a.  Sampling  would  be  done  along  logarithmically  spaced  arcs 
located  at  distances  determined  by  the  scale  of  the  problem. 

b.  The  release  would  be  from  ground  level  and  of  30  minutes 
duration. 

c.  The  maximum  release  rate  would  be  8  kilograms  per  hour. 

d.  The  sample  assaying  would  be  done  at  the  Hanford  Plant. 

e.  Dosages  rather  than  concentrations  of  the  tracer  would  be 
measured. 
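
To illustrate criterion e, the following sketch shows how a dosage (the time
integral of concentration at a sampler) relates to a time-averaged
concentration over a 30 minute release. The function names and the sample
concentration series are hypothetical and are not taken from the Ocean Breeze
data.

```python
def dosage(times_s, conc_g_m3):
    """Time-integrated concentration (g s / m^3) by the trapezoidal rule."""
    if len(times_s) != len(conc_g_m3):
        raise ValueError("times and concentrations must have equal length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (conc_g_m3[i] + conc_g_m3[i - 1]) * dt
    return total


def mean_concentration(dosage_g_s_m3, duration_s):
    """Average concentration (g / m^3) implied by a dosage over a duration."""
    return dosage_g_s_m3 / duration_s


# Hypothetical example: constant 2e-6 g/m^3 during a 30 minute (1800 s) release.
times = [0.0, 600.0, 1200.0, 1800.0]
conc = [2e-6, 2e-6, 2e-6, 2e-6]
d = dosage(times, conc)
c_avg = mean_concentration(d, 1800.0)
```

A filter sampler accumulates tracer for the whole exposure, so it records the
dosage directly; an average concentration can be recovered only by dividing by
an assumed exposure time.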

4.  Release  Technique 

As in Project Green Glow, the zinc sulfide tracer was released using
two  standard  Todd  Insecticidal  Fog  Applicators.  These  aerosol  fog  generators 
consist  of  an  air  blower,  a  combustion  chamber,  a  formulation  pump,  and  a 
gasoline  engine.  The  formulation  used  was  composed  of  zinc  sulfide  pigment, 
the  surface  active  agent  sodium  lauryl  sulfate,  and  water.  Tracer  emission 
rates  were  calculated  by  subtracting  formulation  amounts  in  the  holding  tank 
at  the  end  of  each  run  from  the  amount  recorded  prior  to  each  run.  The 
effective  source  height  was  2-3  meters  above  ground  level. 
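
The emission rate calculation described above amounts to a difference in tank
contents divided by the release duration. A minimal sketch follows. It assumes
the amounts were recorded as masses; the function name, the tracer mass
fraction parameter, and the numbers are hypothetical, since the proportions of
the formulation are not given here.

```python
def tracer_emission_rate(mass_before_kg, mass_after_kg, duration_h,
                         tracer_mass_fraction=1.0):
    """Mean tracer emission rate in kilograms per hour.

    tracer_mass_fraction is a hypothetical parameter: the fraction of the
    formulation mass that is zinc sulfide pigment. The default of 1.0
    returns the rate for the whole formulation.
    """
    if duration_h <= 0:
        raise ValueError("duration must be positive")
    if mass_after_kg > mass_before_kg:
        raise ValueError("tank cannot gain mass during a run")
    dispensed_kg = mass_before_kg - mass_after_kg
    return dispensed_kg * tracer_mass_fraction / duration_h


# Hypothetical 30 minute run that dispenses 4 kg of formulation.
rate = tracer_emission_rate(10.0, 6.0, 0.5)
```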

5.  Sampling

The samplers used were the same as those used in Project Green Glow:
membrane filters in disposable polyethylene holders. These bulk samplers were
used only once; there was a new set of samplers for every trial. The samplers
were assayed using a Rankin counter. This device activates the fluorescent
particles embedded in the filter using alpha bombardment brought about by a
200 microcurie plutonium source. The scintillations that occur are viewed by
a multiplier phototube, amplified, and recorded using a scaler. The
background counting rate was about 5 counts per minute, which was equivalent
to approximately 4E-9 grams of tracer. Unlike in Project Green Glow, dust
contamination was not a problem.

6.  Meteorological  Data 

Meteorological measurements were made by personnel at Pan American's
Cape  Weather  Station,  located  a  few  hundred  feet  downwind  from  arc  2  at  a 
bearing  of  196  degrees  from  the  source.  These  measurements  consisted  of  wind 
speed  and  direction  data  using  a  Belfort  Type  M  device  sited  at  a  height  of  12 
feet  above  ground  level  and  temperature  profiles  from  wiresonde  captive 
instrumented balloons. Standard synoptic and rawinsonde data were provided by
Detachment  11,  4th  Weather  Group,  Air  Weather  Service,  Patrick  Air  Force  Base. 
The  Weather  Detachment  also  made  the  necessary  wind  direction  forecasts  used 
in  the  scheduling  of  each  diffusion  trial. 

D.  PROJECT  DRY  GULCH 

1.  Overview 

Project Dry Gulch was conducted at Vandenberg Air Force Base,
California, during 1961 and 1962 (Reference 27). U.S. Air Force personnel,
General Electric personnel from the Hanford complex in Washington, and various
contractors aided in the setup and/or operation of the 109 trials that
comprised the project. Preparation of the diffusion course was done by the
Martin Company; training of the field crews, scheduling of the trials,
furnishing of the tracer and sampling filters, and tabulation of the data were
accomplished by General Electric. Air Force personnel made meteorological
measurements, and rawinsonde data was made available by the U.S. Weather
Bureau station at Point Arguello. The primary objective of Dry Gulch was the
same as that of Project Ocean Breeze: the development and testing of
diffusion prediction equations from the experimental data. These equations
were needed for use in situations of accidental releases of toxic gases at the
missile range. The Hanford Tracer System was used at Vandenberg. This system
used fluorescent zinc sulfide particles dispensed by aerosol fog generators as
the tracer. Measurements were made on two diffusion courses: course B was
located on the mesa that was 200-300 feet in elevation with a 40-60 foot bluff
at the coastline; course D was located in the Lompoc Valley, which runs from
west-northwest to east-southeast along the southern edge of the aforementioned
mesa. All samplers were placed at a height of 1.5 meters above ground level.

The source point for course B was about 2600 yards inland; the source point
for course D was approximately 1100 yards from the coastline. Course B had 2
concentric sampling arcs, which were 1.43 miles (2301 meters) and 3.52 miles
(5665 meters) downwind. Course D had 3 arcs located 0.53 miles (853 meters),
0.93 miles (1500 meters), and 2.93 miles (4715 meters) downwind.

2.  Site  Description 

The terrain at the experiment site was quite complex. It consisted
of a broad mesa, 200-300 feet tall, which sloped westward and terminated in a
40-60 foot bluff at the coastline. Several sharp ravines cut the mesa, and 2
valleys, running roughly west-northwest to east-southeast, marked the
northern and southern termini of the mesa. The southern valley, the Lompoc,
is quite broad with gently sloping sides; the San Antonio Valley to the north
of Burton Mesa is fairly narrow with relatively steep sides. The samplers
were placed at 2 degree intervals on arc B-1 from 87 to 171 degrees with
respect to the source, and at 1 degree intervals on arc B-2 from 85 to 171
degrees. Samplers were located every 2 degrees from 60 to 180 degrees on arcs
D-1 and D-2, and every 1 degree from 110 to 180 degrees on arc D-3.
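
The sampler layout just described can be enumerated with a short sketch. It
assumes a sampler stood at both end bearings of each arc, which the text does
not state explicitly, so the resulting counts are implied values rather than
reported ones. The function and variable names are hypothetical.

```python
def sampler_bearings(start_deg, stop_deg, step_deg):
    """Bearings (degrees from the source) of samplers on one arc, inclusive."""
    count = int(round((stop_deg - start_deg) / step_deg)) + 1
    return [start_deg + i * step_deg for i in range(count)]


# Arc limits and spacings as quoted in the site description.
ARCS = {
    "B-1": (87, 171, 2),
    "B-2": (85, 171, 1),
    "D-1": (60, 180, 2),
    "D-2": (60, 180, 2),
    "D-3": (110, 180, 1),
}

# Number of sampler positions implied for each arc.
counts = {name: len(sampler_bearings(*spec)) for name, spec in ARCS.items()}
```

Under that assumption the arcs hold 43, 87, 61, 61, and 71 sampler positions,
respectively.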

3.  Experimental  Design 

As stated in the overview, Project Dry Gulch was designed to aid in
the determination of the potential effects of accidental releases of toxics on
the missile range. The most probable accident types were found to be capable
of being modeled by a ground level, continuous point emission. Since actual
emission times may vary significantly, all releases were designed to be of 30
minutes duration. There was a great deal of interest in sea breeze
conditions and the resultant diffusion because of the high frequency of
occurrence of such situations. The geometry of the sampling grid was
formulated using climatological data for the site, terrain features, and the
experience of similar tests in Project Green Glow. Some of the criteria used
to design the tests were:


a.  Sampling  would  be  done  along  logarithmically  spaced  arcs 
located  at  distances  determined  by  the  scale  of  the  problems 
considered. 

b.  The  release  would  be  from  ground  level. 

c.  The  release  would  last  for  30  minutes. 

d.  The  maximum  release  rate  would  be  8  kilograms  per  hour. 

e.  All  sample  assaying  would  be  done  at  the  Hanford  Plant. 

f.  Dosages  rather  than  time  averaged  concentrations  of  the 
tracer  would  be  measured. 

4.  Release  Technique 

Zinc  sulfide  was  released  using  2  Todd  Insecticidal  Fog  Applicators. 
These  devices  are  aerosol  fog  generators  that  consist  of  an  air  blower,  a 
combustion  chamber,  a  formulation  pump,  and  a  gasoline  engine.  The  actual 
formulation  used  in  the  fog  generators  was  composed  of  zinc  sulfide  pigment, 
the  surface  active  agent  sodium  lauryl  sulfate,  and  a  small  amount  of  water. 
The effective source height was 2 to 3 meters above ground level. Tracer
emission  rates  were  calculated  from  the  measured  amounts  of  the  formulation 
recorded  before  and  after  each  trial. 

5.  Sampling 

Membrane filters fitted into disposable polyethylene holders were
used as the sampling devices. The filters were of the same type as those used
in Project Green Glow. As in that previous experiment, these bulk samplers
were used only once; a new set of samplers was required for each trial. The
samples were assayed with a Rankin counter and a multiplier phototube. The
counter was used to activate the embedded fluorescent particles via alpha
bombardment. The excitation caused scintillations that were viewed
by the phototube, then amplified and recorded by a scaler device. It was
thought that the background counting rate of 5 counts per minute was low
enough to allow a wide range of tracer concentrations to be distinguished.

6.  Meteorological Data


Members  of  Detachment  3,  3rd  Weather  Wing,  Air  Weather  Service, 
provided  supporting  meteorological  measurements.  Belfort  Type  M  devices 
placed  12  feet  above  ground  level  were  used  to  measure  wind  speed  and 
direction. For the first 29 trials, measurements of temperature differences
(delta-T) over certain heights were made using a "gantrysonde". This rather
unreliable device was composed of wiresonde instruments mounted on
an Atlas rocket gantry. Wiresondes replaced the "gantrysonde" after trial
number 29. The U.S. Weather Bureau rawinsonde station at Point Arguello made
many special launches and detailed calculations of the temperature and wind
profiles up to the 700 millibar pressure level. While this data was certainly
of use, it should be noted that the rawinsonde launch site was 8 miles south of
and 200 feet higher than the wiresonde site.

E.  THORNEY ISLAND

1.  Overview

Thorney  Island,  England,  was  the  site  of  16  unobstructed,  large-scale 
releases  of  a  heavy  gas  tracer  during  the  summers  of  1982  and  1983  (Reference 
47).  The  experiment  was  conducted  by  the  National  Maritime  Institute  under 
contract  to  the  United  Kingdom  Health  and  Safety  Executive  with  the 
sponsorship  of  numerous  international  organizations.  The  main  objective  was 
to  measure  and  archive  data  for  the  sponsoring  organizations  to  use  in  the 
analysis  and  testing  of  the  capabilities  of  various  computer  models.  To  this 
end, much effort went into ensuring that there would be as few ambiguities as
possible  associated  with  the  data.  For  example,  the  mechanism  for  releasing 
the  gas  tracer  provided  well  defined  initial  conditions  for  model 
calculations.  Instantaneous  releases  of  2000  cubic  meters  of  a  gas  mixture  of 
Freon-12  and  nitrogen  were  accomplished  using  an  accordion-type  container. 
Relative  densities  with  respect  to  air  were  between  1.5  and  4.  Many 
meteorological parameters were measured, including wind speed and direction,
turbulence,  temperature,  humidity,  and  solar  radiation.  Concentrations  of  the 
dispersed  tracer  were  measured  using  gas  sensors,  most  with  a  frequency 
response  of  1  Hertz,  that  actually  recorded  oxygen  deficiency.  Thirty-eight 
fixed  masts,  instrumented  at  0.4,  2.4,  6,  and  10  meters,  were  placed  in  a  100 
by  100  meter  grid.  The  gas  sensors  were  actually  placed  at  heights  of  0.4, 
2.4,  4.4,  and  6  meters  above  ground  level.  The  wind  data  was  measured  at  10 
meters. The gas measurements were averaged over 0.6 second intervals in an
attempt to filter out sensor response noise and to retain the shape of the
sensor responses.

2.  Site  Description 

Thorney Island is located near latitude 51 degrees north, longitude 1
degree west. It is the site of an abandoned airfield; the runways are still
extant. The experiment site was flat and uniform over an area of 1 by 0.5
kilometers. Upwind of the sampling grid, the fetch was clear for 1 kilometer.
The surface was grass interspersed with tarmac runways. The grass was
approximately 15 centimeters in height and the roughness length determined for
the locale was 1 centimeter. In an attempt to minimize temperature
differences between the grass and the runways, all portions of the runway
within the sampling grid were painted white.

3.  Experimental  Design 

The experiment was designed to provide better understanding of the
hazards involved in the storage and transportation of heavier-than-air gases.
Informed judgment of the risk involved in the use of materials such as
ammonia and liquefied natural gas requires a great deal of information.
Prominent in the list of such necessary data is the diffusion of
heavier-than-air gases. The Thorney Island study was specifically designed
to provide data on the dispersion of large-scale instantaneous releases of
negatively buoyant gases and to use this data to quantitatively determine the
capabilities of various classes of computer models. Thorney Island was chosen
as the test site primarily because it met the meteorological, topographical,
and logistical requirements (wind direction steady for long periods of time
or for predictable intervals based on the synoptic and/or mesoscale weather
conditions, flat and unobstructed, easy to supply).

4.  Release  Technique 

Tracer production was carried out with gas-fired vaporizers. The
liquid Freon-12 and nitrogen were vaporized, then mixed in the gaseous phase,
and finally pumped to the release site. The release device consisted of an
inflated, flexible fabric cylinder 14 meters in diameter and 13 meters in
height. This fabric bag was attached to horizontal metal rings, each of
which was supported by cables that in turn were attached to a 22 meter tall
central tower. The cables were disconnected at release times, causing the
fabric bag to fall concertina-fashion. The collapse of the bag was complete
in 1 to 2 seconds, leaving a free-standing cylinder of gas. Orange smoke
mixed  with  the  gas  acted  as  a  marker  for  the  photographic  records  of  the 
trials.  The  release  volume  was  determined  from  the  size  of  the  fabric  bag  and 
the  meter  readings  of  the  individual  gas  storage  tanks. 
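
As a quick consistency check, the stated bag dimensions can be compared with
the nominal 2000 cubic meter release volume quoted in the overview. The sketch
below assumes the inflated bag was a right circular cylinder; the function
name is hypothetical.

```python
import math


def cylinder_volume(diameter_m, height_m):
    """Volume of a right circular cylinder in cubic meters."""
    radius_m = diameter_m / 2.0
    return math.pi * radius_m ** 2 * height_m


# Bag dimensions from the text: 14 m diameter, 13 m height.
volume_m3 = cylinder_volume(14.0, 13.0)  # roughly 2001 cubic meters
```

The result agrees with the nominal release volume to within about 0.1 percent.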

5.  Sampling 

The  gas  monitoring  devices  sited  at  0.4,  2.4,  4.4,  and  6  meters  on 
each  of  the  38  fixed  and  4  mobile  masts  measured  oxygen  deficiency.  The  level 
of  oxygen  deficiency  was  related  to  the  amount  of  tracer  present  in  a  known 
manner. One hundred and eighty sensors had a 1 Hertz frequency response; 5
sensors  were  faster  in  response  (10  Hertz).  Data  was  acquired  using  35 
conversion  units,  each  of  which  provided  analog  to  digital  conversion  for  8 
data  channels.  Each  of  the  conversion  units  was  connected  to  a  central 
computer  which  collected  and  archived  the  digitized  data.  After  the  archiving 
of  the  data,  the  voltage  measurements  were  converted  to  engineering  units 
using the appropriate scales. Once this was accomplished, the gas sensor data
was validated to check the gain and zero shift of the instruments. The data was
then  made  available  to  the  sponsors. 
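
The text says only that oxygen deficiency was related to the amount of tracer
in a known manner. One simple relation, offered here as an assumption rather
than as the calibration actually used at Thorney Island, is that an
oxygen-free tracer mixture dilutes the ambient air uniformly, so the oxygen
mole fraction falls in proportion to the tracer mole fraction. The function
name and the example reading are hypothetical.

```python
AMBIENT_O2_FRACTION = 0.2095  # approximate mole fraction of oxygen in dry air


def tracer_fraction_from_oxygen(measured_o2_fraction,
                                ambient_o2_fraction=AMBIENT_O2_FRACTION):
    """Tracer mole fraction implied by a measured oxygen mole fraction.

    Assumes the tracer mixture contains no oxygen and simply dilutes the air:
    measured = ambient * (1 - tracer), so tracer = 1 - measured / ambient.
    """
    if not 0.0 <= measured_o2_fraction <= ambient_o2_fraction:
        raise ValueError("oxygen fraction outside the expected range")
    return 1.0 - measured_o2_fraction / ambient_o2_fraction


# Hypothetical reading: oxygen at 90 percent of its ambient level,
# which under this assumption implies a tracer mole fraction near 0.1.
c = tracer_fraction_from_oxygen(0.9 * AMBIENT_O2_FRACTION)
```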

6.  Meteorological  Data 

There  were  a  total  of  32  meteorological  instruments  in  use  during  the 
Thorney  Island  trials.  Ten  of  these  devices  were  three-dimensional 
anemometers. Quantities measured included wind speed and direction,
temperature, pressure, relative humidity, and insolation. Several
methods were used to determine the Pasquill-Gifford stability class and they
produced  a  variety  of  values.  The  methods  used  were  delta-T,  the  indirect 
method  of  determining  the  sensible  heat  flux  proposed  by  Smith  in  1979  using 
solarimeter  data,  direct  measurement  of  the  sensible  heat  flux,  Richardson 
number,  bulk  Richardson  number,  and  sigma  theta.  Of  this  plethora  of  methods, 
the  sigma  theta  scheme  seemed  to  work  best.  The  delta-T  method  gave  a  result 
of  class  A  for  each  of  the  unobstructed  trials  (numbers  6-19).  The  direct  and 
indirect  heat  flux  methods  always  resulted  in  class  F.  With  such  variation 
between  methods,  it  required  careful  consideration  of  the  meteorological 
conditions to ascertain the atmospheric stability.